Description

Hybrid Proteins Having Cross-Linking and Tissue-Binding Activities

Technical Field

The present invention relates generally to
methods for producing recombinant hybrid proteins, and
more specifically, to methods for producing hybrid
proteins from host cells through the use of recombinant
DNA techniques.

Background of the Invention 

The use of tissue sealants to replace or augment
mechanical wound closure devices has expanded in recent
years in many surgical and trauma applications. Tissue
sealants include biological adhesives (e.g. fibrin-based
adhesives) and synthetic preparations (e.g. cyanoacrylates).
It is widely acknowledged that the use of synthetic tissue
sealants is limited by their toxicity and their narrow range
of applications. Biological tissue adhesives have
demonstrated utility in cases where the use of mechanical
devices to close wounds is insufficient, such as in
joining blood vessels, closing holes in the dura, and in
surgery on small or delicate tissues such as in the eye or
ear.

Fibrin-based biological tissue adhesives
generally contain fibrinogen, factor XIII and thrombin as
principal ingredients, although in practice biological
tissue adhesives are derived from whole blood and contain
additional blood proteins. The fibrinogen and factor XIII
components of these adhesives are prepared from pooled
human plasma by cryoprecipitation (e.g. U.S. Patents No.
4,377,572; 4,362,567; 4,909,251), by ethanol precipitation
(e.g. U.S. Patent No. 4,442,655) or from single donor
plasma (e.g. U.S. Patent No. 4,627,879; Spotnitz et al.,
Am. Surg. 55: 166-168, 1989). The resultant
fibrinogen/factor XIII preparation is mixed with bovine
thrombin immediately before use to convert the fibrinogen
to fibrin and activate the factor XIII, thus initiating
coagulation of the adhesive.
Fibrin-based tissue adhesives, in their current
form, have significant drawbacks that include poor
standardization, lack of quality control from batch to
batch and the possibility of transmission of human
immunodeficiency virus (HIV), hepatitis virus and other
etiologic agents. While recombinant production of
thrombin and factor XIII has been reported, and while
these proteins might be used in biological tissue
adhesives, the biological tissue adhesives still rely on
large amounts of fibrinogen that is obtained from pooled
human blood. At present, fibrin(ogen)-based
tissue adhesives are not approved for use in the United
States.

There is therefore a need in the art for tissue
adhesive components, particularly components that
facilitate cross-linking to improve clot strength, that
are prepared at high levels with reproducible activity
levels and which do not carry the possibility of
transmission of viral or other etiologic agents. The
present invention addresses these needs by providing
recombinant hybrid proteins that provide cross-linking and
tissue-adhesive properties and that may be prepared at
high levels.
Disclosure of the Invention

Briefly stated, the present invention provides
hybrid proteins having cross-linking and tissue-binding
activities, DNA molecules encoding such hybrid proteins
and methods for producing hybrid proteins by recombinant
means. In one aspect of the invention, the
hybrid proteins comprise a tissue-binding domain from a
first protein covalently linked to a cross-linking domain
from a second protein. Within a related aspect of the
invention, the tissue-binding domain of the first protein
is a heparin binding domain of thrombospondin, a heparin
binding domain of fibronectin, a collagen binding domain
of fibronectin or a cell binding domain of fibronectin.
Within a preferred embodiment, the tissue-binding domain
of the first protein comprises the amino acid sequence of
Sequence ID No. 6 from alanine, amino acid number 2 to
glutamic acid, amino acid number 926. Within another related
aspect of the invention, the cross-linking domain of the
second protein comprises the carboxy-terminal 103 amino
acids of loricrin, the ten amino acid repeat beginning
with glutamine, amino acid number 496 of involucrin or the
400 amino-terminal amino acids of the fibrinogen α chain.
Within a preferred embodiment of the invention, the
cross-linking domain of the second protein comprises the
amino acid sequence of Sequence ID No. 6 from glycine,
amino acid number 928 to proline, amino acid number 1336.
Within a particularly preferred embodiment, the hybrid
protein comprises the amino acid sequence of Sequence ID
No. 6 from alanine, amino acid number 2 to proline, amino
acid number 1336.

The present invention provides DNA molecules
encoding hybrid proteins of the present invention
comprising a first DNA segment encoding a tissue-binding
domain from a first protein joined to a second DNA segment
encoding a cross-linking domain from a second protein. In
one embodiment, the first DNA segment comprises the
nucleotide sequence of Sequence ID No. 5 from nucleotide 3
to nucleotide 2780. In another embodiment, the second DNA
segment comprises the nucleotide sequence of Sequence ID
No. 5 from nucleotide 2784 to nucleotide 4013. In a
preferred embodiment, the DNA molecule comprises the
nucleotide sequence of Sequence ID No. 5 from
nucleotide 3 to nucleotide 4013.
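
The nucleotide and amino acid coordinates quoted above can be cross-checked with simple arithmetic. The following is only a sketch: it assumes, which the text does not state, that nucleotide 3 of Sequence ID No. 5 is the first base of codon 1 of Sequence ID No. 6. The names `codon_span` and `segment_length` are illustrative and do not come from the specification.

```python
# Cross-check of the Sequence ID No. 5 coordinates quoted in the text.
# ASSUMPTION (not stated in the source): nucleotide 3 is the first base
# of codon 1 of Sequence ID No. 6.

CODING_START = 3  # assumed position of the first base of codon 1


def codon_span(codon_number: int, start: int = CODING_START) -> tuple[int, int]:
    """Return (first, last) nucleotide positions of a 1-based codon."""
    first = start + 3 * (codon_number - 1)
    return first, first + 2


def segment_length(first_nt: int, last_nt: int) -> int:
    """Length of an inclusive, 1-based nucleotide range."""
    return last_nt - first_nt + 1


# Ranges quoted in the text (inclusive, 1-based).
SEGMENTS = {
    "first segment": (3, 2780),
    "second segment": (2784, 4013),
    "whole molecule": (3, 4013),
}

for name, (lo, hi) in SEGMENTS.items():
    n = segment_length(lo, hi)
    print(f"{name}: {n} nt = {n // 3} codons, remainder {n % 3}")
```

Under the stated assumption, codon 926 ends at nucleotide 2780 and codon 928 begins at nucleotide 2784, so a single codon (927) lies between the two segments. The three nucleotides following codon 1336 (4011 to 4013) would be consistent with a stop codon. These are inferences from the arithmetic alone, not statements taken from the sequence listing.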

In related embodiments of the invention, DNA
constructs are provided which comprise a DNA molecule
encoding a hybrid protein, wherein said DNA molecule
comprises a first DNA segment encoding a tissue-binding
domain from a first protein joined to a second DNA segment
encoding a cross-linking domain from a second protein and
wherein said DNA molecule is operably linked to other DNA
segments required for the expression of the DNA molecule.
Other embodiments of the invention concern host cells
containing the DNA constructs of the present invention and
methods of producing hybrid proteins.

Brief Description of the Drawings

Figure 1 discloses a representative hybrid
protein containing (1) an N-terminal end-to-end
inter-chain cross-linking domain; (2) a domain that promotes
inter-chain cross-linking; (3) a domain that confers
tissue binding activity; and (4) a carboxy-terminal domain
that promotes end-to-end inter-chain cross-linking.

Figures 2-5 disclose absorbance time courses of
representative cross-linking assays carried out in the
presence of varying levels of factor XIII (activated to
factor XIIIa via thrombin during the assay) or factor
XIIIa.

Detailed Description of the Invention 

The present invention provides novel hybrid
proteins having cross-linking and tissue adhesive
activities. The hybrid proteins comprise a cross-linking
domain from a first protein covalently linked to a
tissue-binding domain from a second protein. The hybrid proteins
of the present invention are capable of cross-linking to
themselves and to other proteins such as fibrin and
fibrinogen and are capable of adhering to cell surfaces
and/or extracellular matrix components. While not wishing
to be bound by a graphical representation, Figure 1 shows
a representative hybrid protein containing an N-terminal
end-to-end inter-chain cross-linking domain; a domain that
promotes inter-chain cross-linking; a domain that confers
tissue binding activity; and a carboxy-terminal domain
that promotes end-to-end inter-chain cross-linking. As
used herein, cross-linking refers to the formation of
covalent bonds between polypeptides.

The hybrid proteins of the present invention are
useful as components of tissue sealant formulations to
provide matrix material and to improve clot strength over
a wound site, and as components in formulations that
promote wound healing. The proteins of the present
invention may contain native (i.e. wild-type) protein
domains as well as domains that are allelic variants and
genetically engineered or synthetic variants of the
respective naturally occurring domains. Such variants are
characterized by the presence of conservative amino acid
substitutions and/or other minor additions, substitutions
or deletions of amino acids.

As used within the context of the present
invention, tissue-binding domains include protein domains
containing amino acid sequences that facilitate adherence
to cell surfaces and/or to extracellular matrix components
such as collagen, fibronectin, hyaluronic acid and
glycosaminoglycans. Fibronectin, for example, contains
the sequence Gly-Arg-Gly-Asp-Ser (from amino acid 1614
through amino acid 1618 of Sequence I.D. No. 3) that has
been shown to be central to cell recognition by the
fibronectin receptor (for review see Yamada, Current
Opinion in Cell Biology 1: 956-963, 1989). The heparin
binding domains of fibronectin (Sekiguchi et al., Proc.
Natl. Acad. Sci. USA 77: 2661-2665, 1980) and
thrombospondin (Zardi et al., EMBO J. 6: 2337-3342, 1987
and Gutman and Kornblihtt, Proc. Natl. Acad. Sci. USA 84:
7179-7182, 1987) contain sequences that recognize heparin
sulfate-containing glycosaminoglycans, which are
extracellular matrix components. The collagen binding
domain of fibronectin (Sekiguchi et al., ibid., 1980)
contains amino acid sequences that bind to the
extracellular matrix component collagen.
Particularly preferred tissue-binding domains
are the heparin binding domain of fibronectin, comprising
the sequence of amino acids of Sequence I.D. No. 2 from
alanine, amino acid number 1812 to valine, amino acid
number 2171; the collagen binding domain of fibronectin,
comprising the sequence of amino acids of Sequence I.D.
No. 2 from glycine, amino acid number 282 to serine, amino
acid number 608; and the amino terminal 229 amino acids of
thrombospondin. In this regard, a particularly preferred
tissue-binding domain is the cell-binding domain of
fibronectin, comprising the sequence of amino acids of
Sequence I.D. No. 3 from alanine, amino acid number 1357
to glutamic acid, amino acid number 1903. It will be
evident to one skilled in the art that smaller portions of
the cell-binding domain of fibronectin may be used within
the hybrid proteins of the present invention, more
particularly the sequence of amino acids of Sequence I.D.
No. 3 from isoleucine, amino acid number 1532 through
threonine, amino acid number 1631. As noted above, it is
generally accepted that the sequence Gly-Arg-Gly-Asp-Ser
(amino acids 1614 to 1618 of Sequence I.D. No. 3) is central
to cell recognition by fibronectin.

Cross-linking domains suitable for use in the
hybrid proteins of the present invention are protein
domains which contain amino acid sequences required for
the formation of specific covalent bonds between peptide
chains. In a preferred embodiment the inter-chain
cross-links are covalent bonds formed by the action of a
transglutaminase such as factor XIII, tissue
transglutaminase, prostate transglutaminase, keratinocyte
transglutaminase, epidermal transglutaminase or placental
transglutaminase. Transglutaminases catalyze the
formation of ε-(γ-glutamyl)lysine bonds between specific
glutamine and lysine residues. However, other inter-chain
cross-links, such as those formed by disulfide bonds, are
also suitable cross-links. Suitable cross-linking domains
include domains from the fibrinogen α chain, the
glutamine/lysine rich domains of loricrin that are
involved in isodipeptide cross-link formation (Hohl et
al., J. Biol. Chem. 266: 6626-6636, 1991), and at least
one of the 10 amino acid-long repeats of involucrin (Cell
46: 583-589, 1986 and Etoh et al., Biochem. Biophys. Res.
Comm. 136: 51-56, 1986). Preferred cross-linking domains
are the carboxy-terminal 103 amino acids of loricrin (Hohl
et al., ibid.) and the ten-amino acid repeat beginning
with glutamine, amino acid number 496 of involucrin (Simon
et al., J. Biol. Chem. 263: 18093-18098, 1988). A
particularly preferred cross-linking domain comprises the
400 amino-terminal amino acids of the fibrinogen α chain
(Doolittle et al., Nature 280: 464-468, 1979; Rixon et
al., Biochemistry 22: 3250-3256, 1983). More
particularly, the amino acid sequence of Sequence ID No. 6
from glycine, amino acid number 928 to proline, amino acid
number 1336 is preferred.

Although the hybrid proteins of the present
invention may consist essentially of covalently linked
cross-linking and tissue binding domains, they may further
contain domains that facilitate end-to-end covalent
cross-linking. The γ chain of fibrinogen contains a domain that
facilitates end-to-end cross-linking to another γ chain
via ε-(γ-glutamyl)lysine bonds. This domain includes at
least the 19 carboxy-terminal amino acids and more
preferably includes the amino-terminal 275 amino acids of
the fibrinogen γ chain. The α chain of fibrinogen contains
an amino-terminal domain that is involved in interchain
disulfide bond formation between α chains. This domain
includes the amino-terminal portion of the α chain of
fibrinogen from glycine, amino acid 36 to glycine, amino
acid 67 of Sequence ID No. 4.

As will be evident to one skilled in the art,
the hybrid proteins of the present invention may contain
domains of human and other animal proteins. Proteins
containing domains suitable for use in the present
invention from human and other animals and the DNA
molecules encoding such proteins have been reported.
Involucrin, loricrin, fibrinogen and fibronectin, for
example, have been studied in a variety of animals. DNA
sequences encoding primate, canine and porcine involucrin
have been reported (Djian and Green, Mol. Biol. Evol. 9:
417-432, 1992; Djian and Green, Proc. Natl. Acad. Sci. USA
88: 5321-5325, 1991 and Tseng and Green, Mol. Biol. Evol.
7: 293-302, 1990). Mehrel et al. (Cell 61: 1103-1112,
1990) have reported a DNA sequence encoding mouse
loricrin. DNA sequences encoding rat and frog fibrinogen
gamma chain have been reported (Haidaris and Courtney,
Blood 79: 1218-1224, 1992 and Bhattacharya et al., Mol.
Cell. Endocrinol. 72: 213-220, 1990; respectively). DNA
sequences encoding chicken and lamprey fibrinogen α chains
have been reported by Weissbach and Greininger (Proc.
Natl. Acad. Sci. USA 87: 5198-5202, 1990) and Pan and
Doolittle (Proc. Natl. Acad. Sci. USA 89: 2066-2070,
1992), respectively. DNA sequences encoding bovine and
rat fibronectin have been reported by Petersen et al.
(Proc. Natl. Acad. Sci. USA 80: 137-141, 1983) and
Schwarzbauer et al. (Cell 35: 421-431, 1983). In
general, it is preferred to prepare proteins that contain
component domains from a single species to minimize the
possibility of immunogenicity. Thus, the present
invention provides hybrid proteins that can be used in
human and veterinary medicine.

According to the present invention, hybrid
proteins having cross-linking and tissue adhesive
activities are produced recombinantly from host cells
transformed with a DNA construct comprising a DNA segment
encoding a cross-linking domain from a first protein
joined to a DNA segment encoding a tissue-binding domain
from a second protein. As used within the context of the
present invention, two or more DNA coding sequences are
said to be joined when, as a result of in-frame fusions
between the DNA coding sequences or as a result of the
removal of intervening sequences by normal cellular
processing, the DNA coding sequences can be translated
into a polypeptide fusion. Unless otherwise noted, the
DNA segments may be joined in any order to result in a DNA
coding sequence that can be translated into a polypeptide
chain. Thus, the DNA segment encoding the tissue-binding
domain may be joined to the 5' or the 3' end of the DNA
segment encoding the cross-linking domain. However, as
will be evident to one skilled in the art, the production
of hybrid proteins that additionally include domains that
facilitate end-to-end cross-linking will require that the
DNA segments encoding such domains be positioned at the 5'
and 3' termini of the molecules.
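
The notion of "joined" coding sequences can be illustrated with a toy in-frame fusion. This is a minimal sketch rather than a method from the specification: the two short DNA segments are hypothetical, chosen only so that one encodes the Gly-Arg-Gly-Asp-Ser motif and the other a short Gln/Lys-containing peptide, and the helper names `translate` and `join_in_frame` are illustrative. The codon table is the standard genetic code.

```python
# Toy illustration of "joined" coding sequences: an in-frame fusion
# translates as one continuous polypeptide. All sequences are hypothetical.
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def join_in_frame(five_prime: str, three_prime: str, adapter: str = "") -> str:
    """Join two coding segments, refusing joins that shift the reading frame."""
    if (len(five_prime) + len(adapter)) % 3 != 0:
        raise ValueError("3' segment would be out of frame")
    return five_prime + adapter + three_prime


tissue_binding = "GGTCGTGGTGATTCT"  # hypothetical; encodes G-R-G-D-S
cross_linking = "CAGAAACAG"         # hypothetical; encodes Q-K-Q
stop = "TAA"

# The segments may be joined in either order, as the text notes.
fusion_a = join_in_frame(tissue_binding, cross_linking + stop)
fusion_b = join_in_frame(cross_linking, tissue_binding + stop)
print(translate(fusion_a))
print(translate(fusion_b))
```

The frame check shows why synthetic adapters or loop-out mutagenesis matter: a 5' segment whose length is not a multiple of three would shift every downstream codon.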

Thus the present invention also provides
isolated DNA molecules encoding hybrid proteins comprising
a cross-linking domain from a first protein covalently
linked to a tissue-binding domain from a second protein.
In general, cDNA sequences are preferred for carrying out
the present invention due to their lack of intervening
sequences which can lead to aberrant RNA processing and
reduced expression levels. DNA molecules encoding human
fibronectin (Dufour et al., Exper. Cell Res. 193: 331-338,
1991) and a human fibrinogen α chain (Rixon et al.,
Biochemistry 22: 3250-3256, 1983) may be obtained from
libraries prepared from liver cells according to standard
laboratory procedures. It will be understood, however,
that suitable DNA sequences can also be obtained from
genomic clones or can be synthesized de novo according to
conventional procedures. If partial clones are obtained,
it is necessary to join them in proper reading frame to
produce a full length clone, using such techniques as
endonuclease cleavage, ligation, and loop-out mutagenesis.

DNA sequences encoding hybrid proteins of the
present invention may be prepared from cloned DNAs using
conventional procedures of endonuclease cleavage,
exonuclease digestion, ligation and in vitro mutagenesis.
Alternatively, DNA sequences encoding the cross-linking
and tissue-binding domains, such as those mentioned above,
may be synthesized using standard laboratory techniques.

An exemplary DNA molecule encoding a hybrid
protein having cross-linking and tissue-binding activities
may be prepared by joining a DNA segment encoding at least
the cell-binding domain of fibronectin and a DNA segment
encoding at least an inter-chain cross-linking domain of
fibrinogen at a convenient restriction site using
synthetic adapters to facilitate in-frame joining of the
DNA segments. Alternatively, such DNA segments encoding
hybrid proteins of the present invention may be prepared
by joining the two domains at a convenient restriction
site followed by loop-out mutagenesis to precisely remove
unnecessary sequences and directly join the DNA segment
encoding the cell-binding domain of fibronectin with the
DNA segment encoding the cross-linking domain of
fibrinogen.

DNA segments encoding the hybrid proteins of the
instant invention are inserted into DNA constructs. As
used within the context of the present invention, a DNA
construct is understood to refer to a DNA molecule, or a
clone of such a molecule, either single- or
double-stranded, which has been modified through human
intervention to contain segments of DNA combined and
juxtaposed in a manner that would not otherwise exist in
nature. DNA constructs of the present invention comprise
a first DNA segment encoding a hybrid protein operably
linked to additional DNA segments required for the
expression of the first DNA segment. Within the context
of the present invention, additional DNA segments will
generally include promoters and transcription terminators,
and may further include enhancers and other elements.

DNA constructs may also contain DNA segments
necessary to direct the secretion of a polypeptide or
protein of interest. Such DNA segments may include at
least one secretory signal sequence. Secretory signal
sequences, also called leader sequences, prepro sequences
and/or pre sequences, are amino acid sequences that act to
direct the secretion of mature polypeptides or proteins
from a cell. Such sequences are characterized by a core
of hydrophobic amino acids and are typically (but not
exclusively) found at the amino termini of newly
synthesized proteins. DNA segments encoding secretory
signal sequences are placed in-frame and in the correct
spatial relationship to the DNA segment encoding the
protein of interest in order to direct the secretion of
the protein. Very often the secretory peptide is cleaved
from the mature protein during secretion. Such secretory
peptides contain processing sites that allow cleavage of
the secretory peptides from the mature proteins as they
pass through the secretory pathway. A preferred
processing site is a dibasic cleavage site, such as that
recognized by the Saccharomyces cerevisiae KEX2 gene
product. A particularly preferred processing site is a
Lys-Arg processing site. Processing sites may be encoded
within the secretory peptide or may be added to the peptide
by, for example, in vitro mutagenesis.
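
As a rough illustration of the Lys-Arg processing site described above, the sketch below scans a peptide in one-letter code for Lys-Arg ("KR") pairs and splits it after the first one. The example sequence and both function names are hypothetical and are used only to show the idea; nothing here models the actual specificity of the processing enzyme beyond the Lys-Arg pair named in the text.

```python
# Locate Lys-Arg (KR) processing sites in a peptide given in one-letter code.
# The example precursor below is hypothetical, not taken from the source.


def find_lys_arg_sites(peptide: str) -> list[int]:
    """Return 0-based indices at which a Lys-Arg pair begins."""
    return [i for i in range(len(peptide) - 1) if peptide[i:i + 2] == "KR"]


def split_after_first_site(peptide: str) -> tuple[str, str]:
    """Split into (leader including Lys-Arg, remainder) at the first site."""
    sites = find_lys_arg_sites(peptide)
    if not sites:
        raise ValueError("no Lys-Arg site found")
    cut = sites[0] + 2  # cleavage is modelled as occurring after the Arg
    return peptide[:cut], peptide[cut:]


# Hypothetical precursor: a hydrophobic leader, a Lys-Arg site, then a
# short "mature" peptide.
precursor = "MKFLSAVLLLASAKRGRGDSQKQ"
leader, mature = split_after_first_site(precursor)
print(find_lys_arg_sites(precursor))
print(leader, mature)
```

A real design would also need to confirm that the mature protein itself contains no unintended Lys-Arg pairs, which the same scan can flag.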

Preferred secretory signals include the α factor
signal sequence (pre-pro sequence: Kurjan and Herskowitz,
Cell 30: 933-943, 1982; Kurjan et al., U.S. Patent No.
4,546,082; Brake, U.S. Patent No. 4,870,008), the PHO5
signal sequence (Beck et al., WO 86/00637), the BAR1
secretory signal sequence (MacKay et al., U.S. Patent No.
4,613,572; MacKay, WO 87/002670) and the SUC2 signal sequence
(Carlsen et al., Molecular and Cellular Biology 3: 439-447,
1983). Alternatively, a secretory signal sequence may
be synthesized according to the rules established, for
example, by von Heijne (European Journal of Biochemistry
133: 17-21, 1983; Journal of Molecular Biology 184: 99-105,
1985; Nucleic Acids Research 14: 4683-4690, 1986).

Secretory signal sequences may be used singly or
may be combined. For example, a DNA segment encoding a
first secretory signal sequence may be used in combination
with a DNA segment encoding the third domain of barrier
(described in U.S. Patent No. 5,037,243, which is
incorporated by reference herein in its entirety). The
DNA segment encoding the third domain of barrier may be
positioned in proper reading frame 3' of the DNA segment
of interest or 5' to the DNA segment and in proper reading
frame with both the DNA segment encoding the secretory
signal sequence and the DNA segment of interest.

The choice of suitable promoters, terminators
and secretory signals is well within the level of ordinary
skill in the art. Methods for expressing cloned genes in
Saccharomyces cerevisiae are generally known in the art
(see "Gene Expression Technology," Methods in Enzymology,
Vol. 185, Goeddel (ed.), Academic Press, San Diego, CA,
1990 and "Guide to Yeast Genetics and Molecular Biology,"
Methods in Enzymology, Guthrie and Fink (eds.), Academic
Press, San Diego, CA, 1991; which are incorporated herein
by reference). Transformation systems for other yeasts,
including Hansenula polymorpha, Schizosaccharomyces pombe,
Kluyveromyces lactis, Kluyveromyces fragilis, Ustilago
maydis, Pichia pastoris, Pichia guillermondii and Candida
maltosa are known in the art. See, for example, Gleeson
et al., J. Gen. Microbiol. 132: 3459-3465, 1986 and Cregg,
U.S. Patent No. 4,882,279.

Proteins of the present invention can also be
expressed in filamentous fungi, for example, strains of
the fungi Aspergillus (McKnight et al., U.S. Patent No.
4,935,349, which is incorporated herein by reference).
Methods for transforming Acremonium chrysogenum are
disclosed by Sumino et al., U.S. Patent No. 5,162,228,
which is incorporated herein by reference.

Other higher eukaryotic cells may also be used
as hosts, including insect cells, plant cells and avian
cells. Transformation of insect cells and production of
foreign proteins therein is disclosed by Guarino et al.,
U.S. Patent No. 5,162,222 and Bang et al., U.S. Patent No.
4,775,624, which are incorporated herein by reference.
The use of Agrobacterium rhizogenes as a vector for
expressing genes in plant cells has been reviewed by
Sinkar et al., J. Biosci. (Bangalore) 11: 47-58, 1987.

WO 94/16085 PCT/US93/12687

Expression of cloned genes in cultured mammalian
cells and in E. coli, for example, is discussed in detail
in Sambrook et al. (Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor, NY, 1989;
which is incorporated herein by reference). In addition
to E. coli, Bacillus and other genera are useful
prokaryotic hosts for expressing foreign proteins. As
would be evident to one skilled in the art, one could
express the proteins of the instant invention in other
host cells such as avian, insect and plant cells using
regulatory sequences, vectors and methods well established
in the literature.

In yeast, suitable vectors for use in the
present invention include YRp7 (Struhl et al., Proc. Natl.
Acad. Sci. USA 76: 1035-1039, 1978), YEp13 (Broach et al.,
Gene 8: 121-133, 1979), POT vectors (Kawasaki et al., U.S.
Patent No. 4,931,373, which is incorporated by reference
herein), pJDB249 and pJDB219 (Beggs, Nature 275: 104-108,
1978) and derivatives thereof. Preferred promoters for
use in yeast include promoters from yeast glycolytic genes
(Hitzeman et al., J. Biol. Chem. 255: 12073-12080, 1980;
Alber and Kawasaki, J. Mol. Appl. Genet. 1: 419-434, 1982;
Kawasaki, U.S. Patent No. 4,599,311) or alcohol
dehydrogenase genes (Young et al., in Genetic Engineering
of Microorganisms for Chemicals, Hollaender et al.
(eds.), p. 355, Plenum, New York, 1982; Ammerer, Meth.
Enzymol. 101: 192-201, 1983). In this regard,
particularly preferred promoters are the TPI1 promoter
(Kawasaki, U.S. Patent No. 4,599,311, 1986) and the
ADH2-4c promoter (Russell et al., Nature 304: 652-654, 1983;
Irani and Kilgore, U.S. Patent Application Serial No.
07/631,763, CA 1,304,020 and EP 284 044, which are
incorporated herein by reference). The expression units
may also include a transcriptional terminator. A
preferred transcriptional terminator is the TPI1
terminator (Alber and Kawasaki, ibid.).

Host cells containing DNA constructs of the
present invention are then cultured to produce the hybrid
proteins. The cells are cultured according to standard
methods in a culture medium containing nutrients required
for growth of the particular host cells. A variety of
suitable media are known in the art and generally include
a carbon source, a nitrogen source, essential amino acids,
vitamins, minerals and growth factors. The growth medium
will generally select for cells containing the DNA
construct by, for example, drug selection or deficiency in
an essential nutrient which is complemented by a
selectable marker on the DNA construct or co-transfected
with the DNA construct.

Selection of a medium appropriate for the
particular host cell used is within the level of ordinary
skill in the art. Yeast cells, for example, are
preferably cultured in a chemically defined medium
comprising a non-amino acid nitrogen source, inorganic
salts, vitamins and essential amino acid supplements. The
pH of the medium is preferably maintained at greater
than 2 and less than 8, preferably at pH 6.5. Methods for
maintaining a stable pH include buffering and constant pH
control, preferably through the addition of sodium
hydroxide or ammonium hydroxide. Preferred buffering
agents include succinic acid and Bis-Tris (Sigma Chemical
Co., St. Louis, MO). Yeast cells having a defect in a
gene required for asparagine-linked glycosylation are
preferably grown in a medium containing an osmotic
stabilizer. A preferred osmotic stabilizer is sorbitol
supplemented into the medium at a concentration between
0.1 M and 1.5 M, preferably at 0.5 M or 1.0 M. Cultured
mammalian cells are generally cultured in commercially
available serum-containing or serum-free media.
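
For bench use, the sorbitol molarities quoted above can be converted to a mass per volume. The sketch below is illustrative only. The molar mass of sorbitol (C6H14O6, about 182.17 g/mol) is a general chemistry value and does not come from the specification, and the function names are hypothetical.

```python
# Convert the sorbitol molarities quoted in the text into grams per litre
# and check a pH against the stated window.
# ASSUMPTION: molar mass of sorbitol (C6H14O6) is about 182.17 g/mol;
# this value is not given in the source.

SORBITOL_G_PER_MOL = 182.17


def sorbitol_grams(molarity: float, litres: float = 1.0) -> float:
    """Mass of sorbitol for a given molarity and volume."""
    if not 0.1 <= molarity <= 1.5:
        raise ValueError("outside the 0.1 M to 1.5 M range given in the text")
    return molarity * SORBITOL_G_PER_MOL * litres


def ph_in_range(ph: float) -> bool:
    """True if the pH is greater than 2 and less than 8, as the text specifies."""
    return 2 < ph < 8


for molarity in (0.5, 1.0):  # the two preferred concentrations
    print(f"{molarity} M sorbitol: {sorbitol_grams(molarity):.1f} g per litre")
print("pH 6.5 within range:", ph_in_range(6.5))
```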

The recombinant hybrid proteins expressed using
the methods described herein are isolated and purified by
conventional procedures, including separating the cells
from the medium by centrifugation or filtration,
precipitating the proteinaceous components of the
supernatant or filtrate by means of a salt, e.g. ammonium
sulfate, purification by a variety of chromatographic
procedures, e.g. ion exchange chromatography or affinity
chromatography, or the like. Methods of protein
purification are known in the art (see generally, Scopes,
R., Protein Purification, Springer-Verlag, NY (1982),
which is incorporated herein by reference) and may be
applied to the purification of the recombinant proteins of
the present invention.

The hybrid proteins of the present invention may
be used as components of tissue adhesives. It is
preferred that the tissue adhesives be formulated to
provide a concentration of the hybrid proteins of the
present invention of between about 5 mg/ml and 100 mg/ml,
with concentrations in the range of 35 to 50 mg/ml being
particularly preferred. As disclosed above, tissue
adhesives generally contain factor XIII and thrombin.
Additional components may also be included in the tissue
adhesive formulations. These additional components
include growth factors such as PDGF, bFGF, TGFα, or EGF
and protease inhibitors, such as aprotinin, tranexamic
acid, alpha-2 plasmin inhibitor, alpha-1-antitrypsin or
the Pittsburgh mutant of alpha-1-antitrypsin (Arg-358
alpha-1-antitrypsin). The tissue adhesives may also
contain salts, buffering agents, reducing agents, bulking
agents, and solubility enhancers. Albumin, NaCl, CaCl2,
citrate and phosphate buffers, for example, may be
included. Preferably, the tissue adhesives of the present
invention are prepared as lyophilized powders, liquid
concentrates or ready-to-use liquids. Lyophilized powders
are preferred for ease of handling and storage.

The following examples are offered by way of
illustration and not by way of limitation.
EXAMPLES

Example 1 - Subcloning and Modification of ADH2 Promoters

An ADH2-4c promoter was constructed as described
in co-pending U.S. Patent Application 07/631,763, CA
1,304,020 and EP 284 044, which are incorporated herein by
reference. A DNA construct comprising the complete
ADH2-4c promoter mutagenized at the 3' end to place an Eco RI
site in place of the translation start codon, designated
p410-4c (deposited with the American Type Culture
Collection (12301 Parklawn Dr., Rockville, MD 20852) under
accession number 68861), was used as the source of the
ADH2-4c promoter.

A PAP-I cDNA (U.S. Patent No. 4,937,324) was
joined with the ADH2-4c promoter. Plasmid pAP1.7,
comprising the 1.7 kb cDNA in pUC18, was cut with Nco I
and Bam HI, and the linearized plasmid was isolated
through two rounds of gel purification. The ADH2-4c
promoter from p410-4c was joined to the 5' end of the
PAP-I cDNA via an Eco RI-Nco I adapter. The 1.2 kb
Bam HI-Eco RI promoter fragment from p410-4c, the
Eco RI-Nco I adapter and the Nco I-Bam HI linearized pAP1.7
plasmid were ligated. The resultant plasmid was designated
pPR1. The presence of the correct promoter fusion was
confirmed by DNA sequencing.

A yeast expression vector comprising the ADH2-4c promoter,
the PAP-I cDNA and the TPI1 terminator was constructed.
Plasmid pZUC13 (comprising the S. cerevisiae chromosomal
LEU2 gene and the origin of replication from the
S. cerevisiae 2 micron plasmid inserted into pUC13 and
constructed in a manner analogous to pZUC12, described in
published EP 195,691, using the plasmid pMT212, which is
described in published EP 163 529) was cut with Bam HI.
Plasmid pPR1 was completely digested with Bam HI and
partially digested with Sac I to isolate the 2.1 kb ADH2-4c
promoter-PAP-I cDNA fragment. Plasmid pTT1 (described in
detail below) was digested with Sac I and Bam HI to isolate
the 0.69 kb TPI1 terminator fragment. The Bam HI-Sac I
fragment from pPR1 and the Sac I-Bam HI fragment from pTT1
were ligated with the Bam HI-linearized pZUC13. A plasmid
containing the expression unit was designated pZ3.

Example 2 - Subcloning of the TPI1 terminator 

The yeast TPI1 terminator fragment was obtained from plasmid
p270 described by Murray and Kelly (U.S. Patent 4,766,073,
which is incorporated by reference herein in its entirety).
Plasmid p270 contains the TPI1 terminator inserted as an
Xba I-Bam HI fragment into YEp13. Alternatively, the TPI1
terminator may be obtained from plasmid pM220 (deposited
with American Type Culture Collection as an E. coli RR1
transformant under accession number 39853) by digesting the
plasmid with Xba I and Bam HI and purifying the TPI1
terminator fragment (~700 bp).

The TPI1 terminator was removed from plasmid p270 as an
Xba I-Bam HI fragment. This fragment was cloned into pUC19
along with another fragment containing the TPI1 promoter
fused to the CAT (chloramphenicol acetyl transferase) gene
to obtain a TPI1 terminator fragment with an Eco RV end. The
resultant plasmid was designated pCAT. The TPI1 terminator
was then cut from pCAT as an Eco RV-Bam HI fragment and
cloned into pIC19H (Marsh et al., Gene 32:481-486, 1984),
which had been cut with the same enzymes, to obtain pTT1
(disclosed in U.S. Patent No. 4,937,324, which is
incorporated herein by reference).

Example 3 - Construction of Yeast Vectors pDPOT and pRPOT

Plasmid pDPOT was derived from plasmid pCPOT (ATCC No.
39685) by replacing the 750 bp Sph I-Bam HI fragment of
pCPOT containing 2 micron and pBR322 sequences with a 186 bp
Sph I-Bam HI fragment derived from the pBR322 tetracycline
resistance gene.

Plasmid pRPOT was derived from plasmid pDPOT by replacing
the Sph I-Bam HI fragment with a polylinker. Plasmid pDPOT
was digested with Sph I and Bam HI to isolate the 10.8 kb
fragment. Oligonucleotides ZC1551 and ZC1552 (Sequence ID
Nos. 7 and 8) were designed to form an adapter with a Bam HI
adhesive end and an Sph I adhesive end flanking Sma I, Sst I
and Xho I restriction sites. Oligonucleotides ZC1551 and
ZC1552 (Sequence ID Nos. 7 and 8) were kinased and annealed
to form the Bam HI-Sph I adapter. The 10.8 kb pDPOT fragment
was circularized by ligation with the ZC1551/ZC1552 adapter
(Sequence ID Nos. 7 and 8). The resultant plasmid was termed
pRPOT.

Example 4 - Construction of a Fibrinogen-Fibronectin Hybrid
cDNA Expression Vector

A. Construction of pFN14A

A DNA construct containing a DNA segment encoding the
fibronectin cell-binding domain operably linked to the
ADH2-4c promoter in plasmid pUC19 was constructed. The
fibronectin coding sequence was obtained from plasmid pFH103
(Dufour et al., Exper. Cell Res. 193:331-338, 1991). Plasmid
pFH103 was digested with Nco I and Xba I to isolate the 4 kb
fragment containing the fibronectin coding sequence.
Oligonucleotides ZC2052 and ZC2053 (Sequence ID Nos. 9 and
10) were designed to provide, upon annealing, an adapter
containing a 5' Eco RI adhesive end, an internal Nco I site,
a DNA segment encoding a methionine and amino acids 979
through 981 of Sequence ID Number 2 and a 3' Nco I adhesive
end that destroys the Nco I site. Oligonucleotides ZC2052
and ZC2053 (Sequence ID Nos. 9 and 10) were annealed and
ligated with the 4 kb Nco I-Xba I fibronectin fragment into
Eco RI-Xba I linearized pUC19. The resultant plasmid was
designated pFN4.

Plasmid pFN4 was digested with Hind III and Apa I to isolate
the 3.3 kb fibronectin fragment. Oligonucleotides ZC2493 and
ZC2491 (Sequence ID Nos. 12 and 11) were designed to
provide, when annealed, an Apa I-Xba I adapter encoding the
amino acids Pro and Phe followed by a stop codon. The
oligonucleotides were annealed and combined with the 3.3 kb
Hind III-Apa I fragment and Hind III-Xba I linearized pUC19
to form plasmid pFN7. Plasmid pFN7 comprises a DNA segment
encoding amino acids 1273-2186 of Sequence ID Number 2
followed by an in-frame stop codon.

The ADH2-4c promoter was joined to the 5' end of the
fibronectin cDNA in plasmid pFN5. Plasmid pFN4 was digested
with Nco I and Hind III to isolate the 0.89 kb fibronectin
coding sequence. Plasmid pZ3 (described in detail above) was
digested with Bam HI and Nco I to isolate the 1.25 kb
ADH2-4c promoter fragment. The 1.25 kb Bam HI-Nco I promoter
fragment and the Nco I-Hind III fibronectin coding sequence
fragment were ligated to Bam HI-Hind III linearized pUC19 to
form plasmid pFN5.

Plasmid pFN5 was digested with Bam HI and Hind III to
isolate the 2.1 kb promoter-fibronectin fragment. Plasmid
pFN7 was digested with Hind III and Xba I to isolate the
2.8 kb fibronectin fragment that was modified to encode a
stop codon following the Pro-Phe sequence. The TPI1
terminator sequence was obtained from pTT1 as a 0.7 kb
Xba I-Sal I fragment. The 2.1 kb Bam HI-Hind III
promoter-fibronectin fragment, the 2.8 kb Hind III-Xba I
fibronectin fragment and the 0.7 kb TPI1 terminator fragment
were joined in a four-part ligation with Bam HI-Xho I
linearized pRPOT. A plasmid containing the fibronectin
expression unit in the pRPOT vector was designated pR1.
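
A directional ligation of this kind depends on each junction pairing ends with matching overhangs. The Sal I end of the terminator fragment can join the Xho I end of the vector because both enzymes leave a 5' TCGA overhang. The following Python sketch is offered for illustration only and is not part of the described method. The helper name `check_assembly` and the table layout are hypothetical, and the fragment sizes are the approximate values given in the text.

```python
# Cohesive ends left by the enzymes used for pR1: (overhang sequence, overhang type).
# The overhangs are standard recognition-site facts; the helper is hypothetical.
ENDS = {
    "BamHI": ("GATC", "5'"),
    "BglII": ("GATC", "5'"),
    "HindIII": ("AGCT", "5'"),
    "XbaI": ("CTAG", "5'"),
    "SalI": ("TCGA", "5'"),
    "XhoI": ("TCGA", "5'"),
}


def check_assembly(vector_ends, fragments):
    """Return the total insert size (bp) if every junction has matching ends.

    vector_ends: (left_enzyme, right_enzyme) of the linearized vector.
    fragments: ordered list of (left_enzyme, right_enzyme, size_bp).
    Raises ValueError at the first junction whose ends do not match.
    """
    left, right = vector_ends
    previous = left
    # Walk the junctions in order: vector/first fragment, then fragment/fragment.
    for frag_left, frag_right, _size in fragments:
        if ENDS[previous] != ENDS[frag_left]:
            raise ValueError(f"{previous} end cannot join {frag_left} end")
        previous = frag_right
    # Final junction: last fragment/right end of the vector.
    if ENDS[previous] != ENDS[right]:
        raise ValueError(f"{previous} end cannot join {right} end")
    return sum(size for _l, _r, size in fragments)


pr1_fragments = [
    ("BamHI", "HindIII", 2100),  # promoter-fibronectin fragment from pFN5
    ("HindIII", "XbaI", 2800),   # fibronectin fragment from pFN7
    ("XbaI", "SalI", 700),       # TPI1 terminator fragment from pTT1
]
insert_bp = check_assembly(("BamHI", "XhoI"), pr1_fragments)
print(insert_bp)
```

The same check applies to the later steps in which a Bgl II end is ligated to a Bam HI-linearized vector, since both enzymes leave a 5' GATC overhang.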

The original clone pFH103 contained a frame-shift mutation
in the EIIIB region of the fibronectin cDNA. The mutation
was corrected by the replacement of the region with an
analogous region from the plasmid pFHA3 (obtained from Jean
Paul Thiery, Laboratoire de Physiopathologie du
Développement, CNRS URA 1337, Ecole Normale Supérieure, 46
rue d'Ulm, 75230 Paris Cedex 05, France). Plasmid pFHA3 was
derived from pFH103 by excising the 3211 bp Xba I-Asp 718 I
fragment of fibronectin, blunting the resultant adhesive
ends and religating. Plasmid pFHA3 contains a DNA segment
encoding the signal and propeptides, the first three and one
half type I repeats, and the carboxy-terminal half of human
fibronectin from the middle of the EIIIB segment.

Plasmid pR1 was digested with Bam HI and Kpn I to isolate
the 2.2 kb promoter-fibronectin fragment. Plasmid pFHA3 was
digested with Kpn I and Apa I to isolate the internal
fibronectin fragment that corrects the frame-shift mutation
present in the parent cDNA from pFH103. Plasmid pR1 was
digested with Apa I and Bam HI to isolate the TPI1
terminator fragment. The 2.2 kb Bam HI-Kpn I
promoter-fibronectin fragment, the 2.75 kb Kpn I-Apa I
internal fibronectin fragment and the 0.69 kb Apa I-Bam HI
TPI1 terminator fragment were joined in a four-part ligation
with Bam HI-linearized pDPOT. The resulting construction was
designated pD32.

A DNA segment encoding the ADH2-4c promoter and initiation
methionine from plasmid pD32 was subcloned into pIC19H
(Marsh et al., Gene 32:481-486, 1984) as a 1.25 kb
Bam HI-Nco I fragment. Plasmid pD32 was also digested with
Nco I and Bgl II to isolate the 3 kb fibronectin cDNA
fragment encoding amino acids 979-1972 of Sequence ID Number
2. The 1.25 kb Bam HI-Nco I fragment and the Nco I-Bgl II
fragment were ligated with Bam HI-linearized pIC19H. A
plasmid containing a Bam HI site proximal to the ADH2-4c
promoter was designated pFN14A.

B. Construction of Plasmid pD38

An expression vector comprising a DNA segment encoding a
fibronectin-fibrinogen hybrid protein operably linked to the
ADH2-4c promoter and the TPI1 terminator was constructed. To
assemble the DNA sequence encoding the hybrid protein, a DNA
segment encoding approximately the carboxy-terminal 409
amino acids of the α chain of fibrinogen was first
subcloned.

A fibrinogen α chain cDNA was obtained from Dominic W. Chung
(Department of Biochemistry, University of Washington,
Seattle, WA) in plasmid pHIa-3 (Rixon et al., Biochemistry
22:3250-3256, 1983). Sequence analysis of the cDNA insert in
plasmid pHIa-3 revealed a deletion of codons 1348-1350 of
the published sequence resulting in the deletion of serine,
amino acid 417.

The DNA segment encoding the carboxy-terminus of the
fibrinogen α chain was subcloned into plasmid pUC19. Plasmid
pHIa-3 was digested with Asp 718 and Ssp I to isolate the
approximately 2 kb fragment encoding the carboxy-terminus of
the fibrinogen α chain from amino acid 244 to amino acid 643
and some 3' untranslated sequence of Sequence ID Number 4.
Plasmid pTT1 was digested with Eco RV and Sal I to isolate
the approximately 700 bp TPI1 terminator fragment. The 2 kb
fibrinogen α chain sequence and the TPI1 terminator sequence
were ligated with pUC19 that had been linearized with
Asp 718 and Sal I. The ligation mixture was transformed into
E. coli, and plasmid DNA was prepared and analyzed by
restriction endonuclease and DNA sequence analysis. DNA
sequence analysis of a candidate clone revealed that the
Sal I site joining the TPI1 terminator sequence and the
pUC19 polylinker site was not present. Plasmid DNA from the
candidate clone was digested with Asp 718 and Bam HI to
liberate the approximately 1.9 kb fibrinogen-TPI1 terminator
fragment.

To join the fibronectin coding sequence with the fibrinogen
α chain sequence, oligonucleotides were synthesized to
provide, when annealed, a Sal I-Asp 718 adapter encoding an
internal Afl II restriction site, a sequence encoding amino
acids 1886 through 1903 of fibronectin (Sequence ID Number
2), a glycine residue and amino acids 235 through 243 of the
fibrinogen α chain (Sequence ID Number 4). Oligonucleotides
ZC3521 and ZC3522 (Sequence ID Nos. 13 and 14) were
annealed. The 1.9 kb Asp 718-Bam HI fibrinogen-TPI1
terminator fragment and the Sal I-Asp 718 ZC3521/ZC3522
adapter (Sequence ID Nos. 13 and 14) were ligated with pUC19
that had been linearized with Sal I and Bam HI. The
resultant plasmid was designated pFG4.

WO 94/16085    PCT/US93/12687

The DNA segment encoding the fibronectin-fibrinogen α chain
sequence in plasmid pFG4 was joined with the DNA segment
encoding the amino-terminal fibronectin sequence (from amino
acid 989 to amino acid 1885 of Sequence ID Number 2) in
plasmid pFN14A to construct plasmid pD37. Plasmid pFN14A was
digested with Bgl II and Afl II to isolate the approximately
3.9 kb ADH2-4c promoter-fibronectin fragment. Plasmid pFG4
was digested with Afl II and Bam HI to isolate the
approximately 2 kb fibronectin-fibrinogen-TPI1 terminator
fragment. The 3.9 kb Bgl II-Afl II fragment and the 2 kb
Afl II-Bam HI fragment were ligated with Bam HI-linearized
pDPOT. A plasmid with the expression unit inserted with the
direction of transcription in the same direction as the POT1
gene in the pDPOT vector was designated pD37.

To place the expression unit present in pD37 in the opposite
orientation, such that the direction of transcription of the
expression unit was opposite to that of the POT1 gene,
plasmid pD37 was digested with Nco I and Xba I to isolate
the approximately 4 kb fibronectin-fibrinogen α chain
fragment. Plasmid pFN14A was digested with Bam HI and Nco I
to isolate the approximately 1.3 kb ADH2-4c promoter
fragment. Plasmid pTT1 was digested with Bam HI and Xba I to
isolate the approximately 700 bp TPI1 terminator fragment.
The Bam HI-Nco I ADH2-4c promoter fragment, the Nco I-Xba I
fibronectin-fibrinogen α chain fragment and the Xba I-Bam HI
TPI1 terminator fragment were ligated with Bam HI-linearized
pDPOT that had been treated with calf alkaline phosphatase
to prevent recircularization. A plasmid containing the
expression unit in the opposite orientation relative to the
POT1 gene was designated pD38. The nucleotide sequence and
deduced amino acid sequence of the DNA segment encoding the
fibronectin-fibrinogen hybrid of plasmid pD38 are shown in
Sequence ID Number 5. Plasmid pD38 was deposited on December
15, 1992 with the American Type Culture Collection (12301
Parklawn Drive, Rockville, MD) as an E. coli transformant.

Example 5 - Expression of a Fibronectin-Fibrinogen Hybrid
Protein in Yeast

Plasmid pD38 was transformed into the Saccharomyces
cerevisiae host strain ZM118 (MATa/MATα ura3/ura3
Δtpi1::URA3/Δtpi1::URA3 leu2-3,112/leu2-3,112 bar1/bar1
pep4::URA3/pep4::URA3 [cir°]) using essentially the method
described by Hinnen et al. (Proc. Natl. Acad. Sci. USA
75:1929-1933, 1978). Transformants were selected for their
ability to grow on medium containing glucose as the sole
carbon source.

The ZM118[pD38] transformant was scaled up in a 60 liter
fermenter to facilitate purification of the hybrid protein.
A single ZM118[pD38] colony was selected from a YEPD + Ade +
Leu plate (Table 1) and inoculated into -LeuTrpThrD medium
(Table 1). The culture was incubated for approximately 52
hours, after which the cells were harvested. The cells were
washed in T.E. buffer (Sambrook et al., ibid.), resuspended
in T.E. buffer + 30% glycerol, and aliquotted into 1 ml seed
vials. The seed vials were stored at -80°C. One seed vial
was used to inoculate 100 ml of YEPD + Ade + Leu (Table 1).
The culture was grown for approximately 28 hours to a final
A660 of 7.7. The 100 ml culture of ZM118[pD38] was
inoculated into a 10 liter fermenter with a final volume of
6.0 liters of medium containing 10 g/L (NH4)2SO4, 5 g/L
KH2PO4, 5 g/L MgSO4·7H2O, 1 g/L NaCl, 0.5 g/L CaCl2·2H2O,
3.68 g/L A.A.I. (Table 1), 4.2 g/L citric acid, 60 g/L
glucose, 10 ml/L Trace Metal Solution (Table 1), 0.4 ml/L
PPG-2025 (polypropylene glycol, MW 2025, Union Carbide Corp,
Danbury, CT) that had been pH adjusted to pH 5.0 with NaOH.
In addition to the inoculation culture, 30 ml of Vitamin
Solution (Table 1) was added. The culture was grown for 23
hours at 30°C with the addition of 2 M NaOH to maintain a pH
of approximately 5.
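
The per-liter amounts given for the 10 liter fermenter medium scale linearly with batch volume. As an illustration only, the short Python sketch below converts the stated concentrations into amounts for the 6.0 liter final volume. The function name `scale_recipe` and the dictionary layout are hypothetical; the concentrations are those listed in the text.

```python
# Per-liter amounts for the 10 liter fermenter medium, as listed in the text.
MEDIUM_G_PER_L = {
    "(NH4)2SO4": 10.0,
    "KH2PO4": 5.0,
    "MgSO4.7H2O": 5.0,
    "NaCl": 1.0,
    "CaCl2.2H2O": 0.5,
    "A.A.I.": 3.68,
    "citric acid": 4.2,
    "glucose": 60.0,
}
MEDIUM_ML_PER_L = {
    "Trace Metal Solution": 10.0,
    "PPG-2025": 0.4,
}


def scale_recipe(per_liter, volume_l):
    """Multiply each per-liter amount by the batch volume in liters."""
    return {name: round(amount * volume_l, 3) for name, amount in per_liter.items()}


batch_g = scale_recipe(MEDIUM_G_PER_L, 6.0)    # grams per 6.0 liter batch
batch_ml = scale_recipe(MEDIUM_ML_PER_L, 6.0)  # milliliters per 6.0 liter batch

for name, grams in batch_g.items():
    print(f"{name}: {grams} g")
for name, ml in batch_ml.items():
    print(f"{name}: {ml} ml")
```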
Table 1
Media Recipes

-LeuThrTrp Amino Acid Mixture

    4 g    (ingredient name missing)
    3 g    L-arginine
    5 g    L-aspartic acid
    2 g    L-histidine free base
    6 g    L-isoleucine
    4 g    L-lysine-mono hydrochloride
    2 g    L-methionine
    6 g    L-phenylalanine
    5 g    L-serine
    5 g    L-tyrosine
    4 g    uracil
    6 g    L-valine

Mix all the ingredients and grind with a mortar and pestle
until the mixture is finely ground.

-LeuTrpThrD

    20 g     glucose
    6.7 g    Yeast Nitrogen Base without amino acids (DIFCO
             Laboratories, Detroit, MI)
    0.6 g    -LeuThrTrp Amino Acid Mixture
    18 g     Agar

Mix all the ingredients in distilled water. Add distilled
water to a final volume of 1 liter. Autoclave 15 minutes.
Pour plates and allow to solidify.

YEPD + Ade + Leu Plates

    20 g     glucose
    20 g     Bacto Peptone (DIFCO Laboratories)
    10 g     Bacto Yeast Extract (DIFCO Laboratories)
    18 g     agar
    4 ml     1% adenine
    8 ml     1% L-leucine

Mix all ingredients in distilled water, and bring to a final
volume of 1 liter. Autoclave 25 minutes and pour plates.

YEPD + Ade + Leu Medium

    20 g     glucose
    20 g     Bacto Peptone (DIFCO Laboratories)
    10 g     Bacto Yeast Extract (DIFCO Laboratories)
    4 ml     1% adenine
    8 ml     1% L-leucine

Mix all ingredients in distilled water, and bring to a final
volume of 1 liter. Autoclave 25 minutes.

A.A.I.

    4.0 g    adenine
    5.0 g    L-alanine
    2.0 g    L-arginine
    5.0 g    L-asparagine
    5.0 g    L-aspartic acid
    5.0 g    L-cysteine
    5.0 g    L-glutamine
    5.0 g    L-glutamic acid
    5.0 g    L-glycine
    8.0 g    L-histidine
    5.0 g    L-isoleucine
    3.0 g    L-lysine-mono hydrochloride
    2.0 g    L-methionine
    5.0 g    L-phenylalanine
    5.0 g    L-proline
    5.0 g    L-serine
    5.0 g    L-threonine
    2.0 g    L-tryptophan
    3.0 g    L-tyrosine
    3.0 g    uracil
    5.0 g    L-valine

Mix all the ingredients and grind with a mortar and pestle
until the mixture is finely ground. Store at room
temperature.

Trace Metal Solution
    0.68 g     ZnCl2
    5.4 g      FeCl3·6H2O
    1.91 g     MnCl2·4H2O
    0.22 g     CuSO4·5H2O
    0.258 g    CoCl2
    0.062 g    H3BO3
    0.002 g    (NH4)6Mo7O24
    0.002 g    KI
    10.0 ml    37% HCl

Dissolve solids in water and bring to a final volume of 1
liter.

Vitamin Solution

    25 mg     d-biotin
    400 mg    thiamine
    400 mg    pyridoxine
    7.5 g     meso-inositol
    7.5 g     Ca pantothenate
    300 mg    niacinamide
    50 mg     folic acid
    100 mg    riboflavin
    500 mg    choline

Dissolve solids in water and bring to a final volume of 1
liter.



A 60 liter fermenter with a final volume of 50 liters of
medium containing 60 g/L yeast extract (Universal Foods,
Milwaukee, WI), 2.5 g/L MgSO4·7H2O (Mallinckrodt Inc., St.
Louis, MO), 1 g/L CaCl2·2H2O (Mallinckrodt, Inc.), 1 g/L KCl
(Mallinckrodt, Inc.), 10 ml/L of Trace Metal Solution (Table
1), 0.5 ml/L PPG-2025 (Union Carbide) that had been adjusted
to a pH of 5.0 with H3PO4 was prepared, and the medium was
sterilized. After sterilization, 5.0 liters of the 23 hour
fermentation culture and 500 ml of Vitamin Solution (Table
1) were inoculated into the medium. During the fermentation,
a solution of 50% glucose, 5% (NH4)2SO4, 0.05% citric acid
was fed into the fermenter at a rate of 150 ml/hour, and the
pH was maintained at approximately pH 5 by the addition of
2 M NH4OH. PPG-2025 was added as needed to control foaming.
At approximately 49 hours post inoculation, an ethanol feed
was begun by the addition of ethanol to the fermenter at a
rate of 150 ml/min. The culture was grown for a total of
67.25 hours at 30°C.

At the end of the fermentation, 50 liters of the culture was
diluted to 100 liters with water. The cells were removed
from the spent medium by centrifuging 50 liters at a time
through a Westfalia CSA 19 centrifuge (Westfalia, Oelde,
Germany) at a flow rate of 4 liters/min. The cells were
rinsed with water. From the centrifugation, approximately 20
liters of cell slurry containing approximately 35% cells was
obtained. Salts were added to the slurry to achieve the
following final concentrations: 50 mM NaCl, 10 mM Na2HPO4,
5 mM EDTA. The cell slurry was passed through a Dynomill
bead mill using 0.5 mm lead-free glass beads (Willy A
Bachofen AG Maschinenfabrik, Basle, Switzerland) at a rate
of 4 liters per minute. The Dynomill was rinsed with Lysis
buffer (50 mM NaCl, 10 mM Na2HPO4, 5 mM EDTA, pH 7.2) to a
final volume of 80 liters. The final slurry had a pH of 6.8,
a temperature of approximately 10°C and a conductivity of
5 mS/cm.

The cell slurry was subjected to centrifugation as described
above, and the cell pellet was rinsed with lysis buffer.
After centrifugation, approximately 20 liters of cell slurry
was obtained. The cell slurry was extracted by first
adjusting the concentration of the cell debris to
approximately 40-50% with lysis buffer. Solid urea, NaCl and
EDTA were added to the cell slurry to achieve a final
concentration of approximately 8 M urea, 0.3 M NaCl and
10 mM EDTA. The approximate salt concentrations were
obtained by the addition of 450 g/L of urea, 18 g/L of NaCl
and 4.2 g/L of EDTA. The cell slurry was adjusted to pH 7.8
with 0.5 M NaOH. The solids were dissolved into the slurry
and the pellets were extracted for a total of 50 minutes.
Following extraction, the mixture was diluted 1 to 4 with
water, adjusted to a conductivity of 12.5 mS/cm with NaCl
and adjusted to a pH of 9.5 with 0.5 M NaOH.
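
The relation between the gram-per-liter additions and the stated approximate molar concentrations can be sketched as a simple mass-to-mole conversion. The Python example below is illustrative only. It gives nominal values that ignore the volume change on dissolving the solids and the solids content of the slurry. The molar masses are standard values, except that the EDTA figure assumes the disodium salt dihydrate, which the text does not specify; the function name `nominal_molarity` is hypothetical.

```python
# Molar masses in g/mol. Urea and NaCl are standard values; the EDTA entry
# ASSUMES the disodium salt dihydrate, which the text does not specify.
MOLAR_MASS = {
    "urea": 60.06,
    "NaCl": 58.44,
    "EDTA": 372.24,
}

# Solid additions per liter of slurry, as stated in the text.
ADDED_G_PER_L = {
    "urea": 450.0,
    "NaCl": 18.0,
    "EDTA": 4.2,
}


def nominal_molarity(grams_per_liter, molar_mass):
    """Return mol/L, ignoring any volume increase from the dissolved solid."""
    return grams_per_liter / molar_mass


nominal = {
    name: nominal_molarity(grams, MOLAR_MASS[name])
    for name, grams in ADDED_G_PER_L.items()
}
for name, molar in nominal.items():
    print(f"{name}: {molar:.3f} M")
```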

The extracted slurry was centrifuged as described above with
the lysis buffer rinse. The pH of the supernatant was
adjusted to pH 9.5 with 0.5 M NaOH. The supernatant was
analyzed by SDS polyacrylamide gel electrophoresis
(SDS-PAGE) using the PHAST System Separation and Control
Unit (Pharmacia LKB Biotechnology Inc., Piscataway, NJ), and
the protein was visualized using Coomassie Blue staining. A
2 liter Q-Sepharose column (Pharmacia) was equilibrated at
5 liters/hour with successive washes of the following
solutions: 8 liters of 3 M urea, 1 M NaCl, 50 mM glycine, pH
11.5; 5 liters of 0.5 M NaOH; 1.5 liters of water; 5 liters
of 0.1 M HCl; and 6.0 liters of Wash buffer (50 mM glycine,
90 mM NaCl, pH 9.5 with a conductivity of 12.5 mS/cm). The
supernatant (110 liters) was then applied to the column at
5 liters per hour.

The column ran dry after loading the supernatant. The gel
was resuspended in Wash buffer and repacked. The repacked
column was washed with 4 liters of 50 mM glycine, 90 mM
NaCl, 5 mM EDTA, pH 10.0. The material was eluted with
elution buffer (50 mM glycine, 5 mM EDTA (pH 9.9) with a
final concentration of NaCl giving a conductivity of 30.2
mS/cm (approximately 270 mM NaCl)) at 100 ml per minute.
Fractions of approximately 600 ml were collected after the
conductivity of the eluant reached the conductivity of the
elution buffer. Fractions were analyzed by SDS-PAGE as
described above, and fractions 1 through 10 were pooled.

The pooled fractions were then applied to a 2 liter phenyl
Sepharose column (Pharmacia) that had been equilibrated by
successive washes at 5 liters per hour with the following
solutions: 3 liters of 0.5 M NaOH; 3 liters of water; 3
liters of 2 M urea, 50 mM glycine, pH 10.5; 1.5 liters of
water; 3 liters of 0.1 M HCl; and 3 liters of Equilibration
buffer (50 mM glycine, 2.5 M NaCl, 2 mM EDTA (pH 10.0) with
a conductivity of 180 mS/cm). The pooled peak fractions,
which had been adjusted to a conductivity of 180 mS/cm with
NaCl and a pH of 10.0 with 0.5 M NaOH, were loaded onto the
phenyl Sepharose column. Following the loading of the peak
fractions, the column was washed with Equilibration buffer.
The column was eluted with 6 liters of 50 mM glycine, 2 mM
EDTA (pH 10.25) with a NaCl concentration giving the
solution a conductivity of 96 mS/cm. The conductivity of the
eluant was measured throughout the elution. The conductivity
of the eluant upon starting the elution was 180 mS/cm. In
the third fraction, the conductivity of the eluant dropped
to 96 mS/cm. At this point, the elution buffer was changed
to a buffer having a conductivity of 42 mS/cm.
The eluant was collected through fraction number 8.

Example 6 - Cross-Linking Assay Using the Hybrid
Fibrinogen-Fibronectin Protein

The ability of the purified fibrinogen-fibronectin hybrid
protein to form transglutaminase-catalyzed interchain cross
links was assessed. The transglutaminase activity was
provided by the addition of recombinant factor XIII and
thrombin or by the addition of recombinant factor XIIIa.

A. Preparation of Factor XIII

Recombinant factor XIII was prepared essentially as
described in co-pending U.S. Patent Application No.
07/927,196, which is incorporated by reference herein in its
entirety. Briefly, factor XIII was isolated from a strain of
the yeast Saccharomyces cerevisiae that had been transformed
with an expression vector capable of directing the
expression of factor XIII. The factor XIII-producing cells
were harvested and lysed, and a cleared lysate was prepared.
The lysate was fractionated by anion exchange chromatography
at neutral to slightly alkaline pH using a column of
derivatized agarose, such as DEAE FAST-FLOW SEPHAROSE
(Pharmacia LKB Biotechnology, Piscataway, NJ) or the like.
Factor XIII was then precipitated from the column eluate by
concentrating the eluate and adjusting the pH to between 5.2
and 5.5, such as by diafiltration against ammonium succinate
buffer. The precipitate was then dissolved and further
purified using conventional chromatographic techniques, such
as gel filtration and hydrophobic interaction
chromatography. The purified factor XIII was dialyzed,
filtered, aliquotted and lyophilized. The factor XIIIa
content was determined (Bishop et al., Biochemistry
29:1861-1869, 1990, which is incorporated by reference
herein in its entirety) by fluorometric assay of the
dissolved, thrombin-activated material.

Factor XIII was activated to factor XIIIa by adding 2 U of
thrombin per 100 mg of factor XIII. The factor XIII was
dissolved in buffer (20 mM sodium borate (pH 8.3), 1 mM
CaCl2). The thrombin was added, and the reaction was
incubated at room temperature for twenty minutes.

B. Cross-Linking Assays

The level of cross-linking between the hybrid proteins was
measured as a rise in the absorbance at 350 nm over time in
reaction mixtures containing the hybrid protein, factor XIII
and thrombin or the hybrid protein and factor XIIIa. Control
reactions were prepared containing factor XIII and thrombin
or factor XIIIa alone. Cross-linking reactions were carried
out in 1 ml cuvettes. For cross-linking reactions containing
factor XIII and thrombin, each reaction mixture was set up
by placing 110 µl containing 40 Units of factor XIII,
36.7 µl containing 13 Units of factor XIII or 12.2 µl
containing 4 Units of factor XIII (described above) in one
corner of the cuvette and 20 µl containing 4 Units of
thrombin (Sigma) in the opposite corner such that the
solutions were not mixed. The reaction was initiated by the
addition of 1 ml of 2 mg/ml hybrid protein in buffer (10 mM
Tris (pH 7.6), 20 mM sodium borate, 140 mM NaCl, 10 mM
CaCl2). The absorbance of each reaction was read at 350 nm
with the addition of protein being the first absorbance
point. For cross-linking reactions containing factor XIIIa,
each reaction was set up by placing 110 µl containing 40
Units of factor XIIIa, 36.7 µl containing 13 Units of factor
XIIIa or 12.2 µl containing 4 Units of factor XIIIa in the
cuvette and adding 1 ml of 2 mg/ml hybrid in buffer (10 mM
Tris (pH 7.6), 140 mM NaCl, 10 mM CaCl2). The absorbance of
the solution was read at 350 nm as described above. Analysis
of the data generated from the absorbance time courses
showed a sharp increase in absorbance in the presence of
hybrid protein and the active transglutaminase relative to
the rise in absorbance in the absence of hybrid protein
(Figures 2-5). The results indicated that the hybrid protein
is capable of transglutaminase-induced cross-linking.
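
One simple way to compare absorbance time courses of this kind is to estimate the rate of rise in absorbance at 350 nm by a least-squares line through the early readings. The Python sketch below is illustrative only and is not the analysis used to produce Figures 2-5. The function name `initial_rate` is hypothetical, and the readings are invented values chosen to show the calculation.

```python
def initial_rate(times_min, absorbances):
    """Least-squares slope of absorbance versus time (A350 units per minute)."""
    n = len(times_min)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    # Slope = covariance(time, absorbance) / variance(time).
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))
    return sxy / sxx


# HYPOTHETICAL readings for illustration; these are not data from the figures.
times = [0, 1, 2, 3, 4]
with_hybrid = [0.05, 0.15, 0.25, 0.35, 0.45]  # hybrid protein plus factor XIIIa
control = [0.05, 0.06, 0.07, 0.08, 0.09]      # factor XIIIa alone

rate_hybrid = initial_rate(times, with_hybrid)
rate_control = initial_rate(times, control)
print(rate_hybrid > rate_control)
```

A larger slope for the reaction containing hybrid protein than for the control corresponds to the sharper increase in absorbance described above.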

From the foregoing it will be appreciated that, although
specific embodiments of the invention have been described
herein for the purpose of illustration, various
modifications may be made without deviation from the spirit
and scope of the invention. Accordingly, the invention is
not to be limited except as by the following claims.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Irani, Meher H.
(ii) TITLE OF INVENTION: HYBRID CROSS-LINKING PROTEINS 
(iii) NUMBER OF SEQUENCES: 14 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: ZymoGenetics, Inc. 

(B) STREET: 4225 Roosevelt Way, N.E. 

(C) CITY: Seattle 

(D) STATE: WA 

(E) COUNTRY: USA 

(F) ZIP: 98105 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: WO 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/998,271 

(B) FILING DATE: 31-DEC-1992 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Parker, Gary E 

(B) REGISTRATION NUMBER: 31-648 

(C) REFERENCE/DOCKET NUMBER: 92-26PC 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 206-547-8080 ext 322 

(B) TELEFAX: 206-548-2329 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7803 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TCAAC ATG CTT AGG GGT CCG GGG CCC GGG CTG CTG CTG CTG GCC GTC 47 
Met Leu Arg Gly Pro Gly Pro Gly Leu Leu Leu Leu Ala Val 
1 5 10 

CTG TGC CTG GGG ACA GCG GTG CCC TCC ACG GGA GCC TCG AAG AGC AAG 95 
Leu Cys Leu Gly Thr Ala Val Pro Ser Thr Gly Ala Ser Lys Ser Lys 
15 20 25 30 

AGG CAG GCT CAG CAA ATG GTT CAG CCC CAG TCC CCG GTG GCT GTC AGT 143 
Arg Gln Ala Gln Gln Met Val Gln Pro Gln Ser Pro Val Ala Val Ser 
35 40 45 

CAA AGC AAG CCC GGT TGT TAT GAC AAT GGA AAA CAC TAT CAG ATA AAT 191 
Gln Ser Lys Pro Gly Cys Tyr Asp Asn Gly Lys His Tyr Gln Ile Asn 
50 55 60 

CAA CAG TGG GAG CGG ACC TAC CTA GGT AAT GTG TTG GTT TGT ACT TGT 239 
Gln Gln Trp Glu Arg Thr Tyr Leu Gly Asn Val Leu Val Cys Thr Cys 
65 70 75 

TAT GGA GGA AGC CGA GGT TTT AAC TGC GAA AGT AAA CCT GAA GCT GAA 287 
Tyr Gly Gly Ser Arg Gly Phe Asn Cys Glu Ser Lys Pro Glu Ala Glu 
80 85 90 

GAG ACT TGC TTT GAC AAG TAC ACT GGG AAC ACT TAC CGA GTG GGT GAC 335 
Glu Thr Cys Phe Asp Lys Tyr Thr Gly Asn Thr Tyr Arg Val Gly Asp 
95 100 105 110 

ACT TAT GAG CGT CCT AAA GAC TCC ATG ATC TGG GAC TGT ACC TGC ATC 383 
Thr Tyr Glu Arg Pro Lys Asp Ser Met Ile Trp Asp Cys Thr Cys Ile 
115 120 125 

GGG GCT GGG CGA GGG AGA ATA AGC TGT ACC ATC GCA AAC CGC TGC CAT 431 
Gly Ala Gly Arg Gly Arg Ile Ser Cys Thr Ile Ala Asn Arg Cys His 
130 135 140 

GAA GGG GGT CAG TCC TAC AAG ATT GGT GAC ACC TGG AGG AGA CCA CAT 479 
Glu Gly Gly Gln Ser Tyr Lys Ile Gly Asp Thr Trp Arg Arg Pro His 
145 150 155 

GAG ACT GGT GGT TAC ATG TTA GAG TGT GTG TGT CTT GGT AAT GGA AAA 527 
Glu Thr Gly Gly Tyr Met Leu Glu Cys Val Cys Leu Gly Asn Gly Lys 
160 165 170 

GGA GAA TGG ACC TGC AAG CCC ATA GCT GAG AAG TGT TTT GAT CAT GCT 575 
Gly Glu Trp Thr Cys Lys Pro Ile Ala Glu Lys Cys Phe Asp His Ala 
175 180 185 190 

GCT GGG ACT TCC TAT GTG GTC GGA GAA ACG TGG GAG AAG CCC TAC CAA 623 
Ala Gly Thr Ser Tyr Val Val Gly Glu Thr Trp Glu Lys Pro Tyr Gin 
195 200 205 
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GGC TGG ATG ATG GTA GAT TGT ACT TGC CTG GGA GAA GGC AGC GGA CGC 671 
Gly Trp Met Met Val Asp Cys Thr Cys Leu Gly Glu Gly Ser Gly Arg 
210 215 220 

ATC ACT TGC ACT TCT AGA AAT AGA TGC AAC GAT CAG GAC ACA AGG ACA 719 
Ile Thr Cys Thr Ser Arg Asn Arg Cys Asn Asp Gln Asp Thr Arg Thr
225 230 235 

TCC TAT AGA ATT GGA GAC ACC TGG AGC AAG AAG GAT AAT CGA GGA AAC 767 
Ser Tyr Arg Ile Gly Asp Thr Trp Ser Lys Lys Asp Asn Arg Gly Asn
240 245 250 

CTG CTC CAG TGC ATC TGC ACA GGC AAC GGC CGA GGA GAG TGG AAG TGT 815 
Leu Leu Gln Cys Ile Cys Thr Gly Asn Gly Arg Gly Glu Trp Lys Cys
255 260 265 270 

GAG AGG CAC ACC TCT GTG CAG ACC ACA TCG AGC GGA TCT GGC CCC TTC 863 
Glu Arg His Thr Ser Val Gln Thr Thr Ser Ser Gly Ser Gly Pro Phe
275 280 285 

ACC GAT GTT CGT GCA GCT GTT TAC CAA CCG CAG CCT CAC CCC CAG CCT 911 
Thr Asp Val Arg Ala Ala Val Tyr Gln Pro Gln Pro His Pro Gln Pro
290 295 300 

CCT CCC TAT GGC CAC TGT GTC ACA GAC AGT GGT GTG GTC TAC TCT GTG 959 
Pro Pro Tyr Gly His Cys Val Thr Asp Ser Gly Val Val Tyr Ser Val 
305 310 315 

GGG ATG CAG TGG TTG AAG ACA CAA GGA AAT AAG CAA ATG CTT TGC ACG 1007 
Gly Met Gln Trp Leu Lys Thr Gln Gly Asn Lys Gln Met Leu Cys Thr
320 325 330 

TGC CTG GGC AAC GGA GTC AGC TGC CAA GAG ACA GCT GTA ACC CAG ACT 1055 
Cys Leu Gly Asn Gly Val Ser Cys Gln Glu Thr Ala Val Thr Gln Thr
335 340 345 350 

TAC GGT GGC AAC TTA AAT GGA GAG CCA TGT GTC TTA CCA TTC ACC TAC 1103 
Tyr Gly Gly Asn Leu Asn Gly Glu Pro Cys Val Leu Pro Phe Thr Tyr 
355 360 365 

AAT GGC AGG ACG TTC TAC TCC TGC ACC ACG GAA GGG CGA CAG GAC GGA 1151 
Asn Gly Arg Thr Phe Tyr Ser Cys Thr Thr Glu Gly Arg Gln Asp Gly
370 375 380 

CAT CTT TGG TGC AGC ACA ACT TCG AAT TAT GAG CAG GAC CAG AAA TAC 1199 
His Leu Trp Cys Ser Thr Thr Ser Asn Tyr Glu Gln Asp Gln Lys Tyr
385 390 395 

TCT TTC TGC ACA GAC CAC ACT GTT TTG GTT CAG ACT CAA GGA GGA AAT 1247
Ser Phe Cys Thr Asp His Thr Val Leu Val Gln Thr Gln Gly Gly Asn
400 405 410 

TCC AAT GGT GCC TTG TGC CAC TTC CCC TTC CTA TAC AAC AAC CAC AAT 1295 
Ser Asn Gly Ala Leu Cys His Phe Pro Phe Leu Tyr Asn Asn His Asn 
415 420 425 430 
TAC ACT GAT TGC ACT TCT GAG GGC AGA AGA GAC AAC ATG AAG TGG TGT 1343
Tyr Thr Asp Cys Thr Ser Glu Gly Arg Arg Asp Asn Met Lys Trp Cys 
435 440 445 

GGG ACC ACA CAG AAC TAT GAT GCC GAC CAG AAG TTT GGG TTC TGC CCC 1391 
Gly Thr Thr Gln Asn Tyr Asp Ala Asp Gln Lys Phe Gly Phe Cys Pro
450 455 460 

ATG GCT GCC CAC GAG GAA ATC TGC ACA ACC AAT GAA GGG GTC ATG TAC 1439 
Met Ala Ala His Glu Glu Ile Cys Thr Thr Asn Glu Gly Val Met Tyr
465 470 475 

CGC ATT GGA GAT CAG TGG GAT AAG CAG CAT GAC ATG GGT CAC ATG ATG 1487 
Arg Ile Gly Asp Gln Trp Asp Lys Gln His Asp Met Gly His Met Met
480 485 490 

AGG TGC ACG TGT GTT GGG AAT GGT CGT GGG GAA TGG ACA TGC ATT GCC 1535 
Arg Cys Thr Cys Val Gly Asn Gly Arg Gly Glu Trp Thr Cys Ile Ala
495 500 505 510 

TAC TCG CAA CTT CGA GAT CAG TGC ATT GTT GAT GAC ATC ACT TAC AAT 1583 
Tyr Ser Gln Leu Arg Asp Gln Cys Ile Val Asp Asp Ile Thr Tyr Asn
515 520 525 

GTG AAC GAC ACA TTC CAC AAG CGT CAT GAA GAG GGG CAC ATG CTG AAC 1631 
Val Asn Asp Thr Phe His Lys Arg His Glu Glu Gly His Met Leu Asn 
530 535 540 

TGT ACA TGC TTC GGT CAG GGT CGG GGC AGG TGG AAG TGT GAT CCC GTC 1679 
Cys Thr Cys Phe Gly Gln Gly Arg Gly Arg Trp Lys Cys Asp Pro Val
545 550 555 

GAC CAA TGC CAG GAT TCA GAG ACT GGG ACG TTT TAT CAA ATT GGA GAT 1727 
Asp Gln Cys Gln Asp Ser Glu Thr Gly Thr Phe Tyr Gln Ile Gly Asp
560 565 570 

TCA TGG GAG AAG TAT GTG CAT GGT GTC AGA TAC CAG TGC TAC TGC TAT 1775 
Ser Trp Glu Lys Tyr Val His Gly Val Arg Tyr Gln Cys Tyr Cys Tyr
575 580 585 590 

GGC CGT GGC ATT GGG GAG TGG CAT TGC CAA CCT TTA CAG ACC TAT CCA 1823 
Gly Arg Gly Ile Gly Glu Trp His Cys Gln Pro Leu Gln Thr Tyr Pro
595 600 605 

AGC TCA AGT GGT CCT GTC GAA GTA TTT ATC ACT GAG ACT CCG AGT CAG 1871 
Ser Ser Ser Gly Pro Val Glu Val Phe Ile Thr Glu Thr Pro Ser Gln
610 615 620

CCC AAC TCC CAC CCC ATC CAG TGG AAT GCA CCA CAG CCA TCT CAC ATT 1919 
Pro Asn Ser His Pro Ile Gln Trp Asn Ala Pro Gln Pro Ser His Ile
625 630 635 
TCC AAG TAC ATT CTC AGG TGG AGA CCT AAA AAT TCT GTA GGC CGT TGG 1967
Ser Lys Tyr Ile Leu Arg Trp Arg Pro Lys Asn Ser Val Gly Arg Trp
640 645 650

AAG GAA GCT ACC ATA CCA GGC CAC TTA AAC TCC TAC ACC ATC AAA GGC 2015 
Lys Glu Ala Thr Ile Pro Gly His Leu Asn Ser Tyr Thr Ile Lys Gly
655 660 665 670 

CTG AAG CCT GGT GTG GTA TAC GAG GGC CAG CTC ATC AGC ATC CAG CAG 2063 
Leu Lys Pro Gly Val Val Tyr Glu Gly Gln Leu Ile Ser Ile Gln Gln
675 680 685 

TAC GGC CAC CAA GAA GTG ACT CGC TTT GAC TTC ACC ACC ACC AGC ACC 2111 
Tyr Gly His Gln Glu Val Thr Arg Phe Asp Phe Thr Thr Thr Ser Thr
690 695 700 

AGC ACA CCT GTG ACC AGC AAC ACC GTG ACA GGA GAG ACG ACT CCC TTT 2159 
Ser Thr Pro Val Thr Ser Asn Thr Val Thr Gly Glu Thr Thr Pro Phe 
705 710 715 

TCT CCT CTT GTG GCC ACT TCT GAA TCT GTG ACC GAA ATC ACA GCC AGT 2207 
Ser Pro Leu Val Ala Thr Ser Glu Ser Val Thr Glu Ile Thr Ala Ser
720 725 730 

AGC TTT GTG GTC TCC TGG GTC TCA GCT TCC GAC ACC GTG TCG GGA TTC 2255 
Ser Phe Val Val Ser Trp Val Ser Ala Ser Asp Thr Val Ser Gly Phe 
735 740 745 750 

CGG GTG GAA TAT GAG CTG AGT GAG GAG GGA GAT GAG CCA CAG TAC CTG 2303 
Arg Val Glu Tyr Glu Leu Ser Glu Glu Gly Asp Glu Pro Gln Tyr Leu
755 760 765 

GAT CTT CCA AGC ACA GCC ACT TCT GTG AAC ATC CCT GAC CTG CTT CCT 2351 
Asp Leu Pro Ser Thr Ala Thr Ser Val Asn Ile Pro Asp Leu Leu Pro
770 775 780 

GGC CGA AAA TAC ATT GTA AAT GTC TAT CAG ATA TCT GAG GAT GGG GAG 2399 
Gly Arg Lys Tyr Ile Val Asn Val Tyr Gln Ile Ser Glu Asp Gly Glu
785 790 795 

CAG AGT TTG ATC CTG TCT ACT TCA CAA ACA ACA GCG CCT GAT GCC CCT 2447 
Gln Ser Leu Ile Leu Ser Thr Ser Gln Thr Thr Ala Pro Asp Ala Pro
800 805 810 

CCT GAC CCG ACT GTG GAC CAA GTT GAT GAC ACC TCA ATT GTT GTT CGC 2495 
Pro Asp Pro Thr Val Asp Gln Val Asp Asp Thr Ser Ile Val Val Arg
815 820 825 830 

TGG AGC AGA CCC CAG GCT CCC ATC ACA GGG TAC AGA ATA GTC TAT TCG 2543 
Trp Ser Arg Pro Gln Ala Pro Ile Thr Gly Tyr Arg Ile Val Tyr Ser
835 840 845 

CCA TCA GTA GAA GGT AGC AGC ACA GAA CTC AAC CTT CCT GAA ACT GCA 2591 
Pro Ser Val Glu Gly Ser Ser Thr Glu Leu Asn Leu Pro Glu Thr Ala 
850 855 860 
AAC TCC GTC ACC CTC AGT GAC TTG CAA CCT GGT GTT CAG TAT AAC ATC 2639
Asn Ser Val Thr Leu Ser Asp Leu Gln Pro Gly Val Gln Tyr Asn Ile
865 870 875 

ACT ATC TAT GCT GTG GAA GAA AAT CAA GAA AGT ACA CCT GTT GTC ATT 2687 
Thr Ile Tyr Ala Val Glu Glu Asn Gln Glu Ser Thr Pro Val Val Ile
880 885 890 

CAA CAA GAA ACC ACT GGC ACC CCA CGC TCA GAT ACA GTG CCC TCT CCC 2735 
Gln Gln Glu Thr Thr Gly Thr Pro Arg Ser Asp Thr Val Pro Ser Pro
895 900 905 910 

AGG GAC CTG CAG TTT GTG GAA GTG ACA GAC GTG AAG GTC ACC ATC ATG 2783 
Arg Asp Leu Gln Phe Val Glu Val Thr Asp Val Lys Val Thr Ile Met
915 920 925 

TGG ACA CCG CCT GAG AGT GCA GTG ACC GGC TAC CGT GTG GAT GTG ATC 2831 
Trp Thr Pro Pro Glu Ser Ala Val Thr Gly Tyr Arg Val Asp Val Ile
930 935 940 

CCC GTC AAC CTG CCT GGC GAG CAC GGG CAG AGG CTG CCC ATC AGC AGG 2879 
Pro Val Asn Leu Pro Gly Glu His Gly Gln Arg Leu Pro Ile Ser Arg
945 950 955

AAC ACC TTT GCA GAA GTC ACC GGG CTG TCC CCT GGG GTC ACC TAT TAC 2927 
Asn Thr Phe Ala Glu Val Thr Gly Leu Ser Pro Gly Val Thr Tyr Tyr 
960 965 970 

TTC AAA GTC TTT GCA GTG AGC CAT GGG AGG GAG AGC AAG CCT CTG ACT 2975 
Phe Lys Val Phe Ala Val Ser His Gly Arg Glu Ser Lys Pro Leu Thr 
975 980 985 990 

GCT CAA CAG ACA ACC AAA CTG GAT GCT CCC ACT AAC CTC CAG TTT GTC 3023 
Ala Gln Gln Thr Thr Lys Leu Asp Ala Pro Thr Asn Leu Gln Phe Val
995 1000 1005 

AAT GAA ACT GAT TCT ACT GTC CTG GTG AGA TGG ACT CCA CCT CGG GCC 3071 
Asn Glu Thr Asp Ser Thr Val Leu Val Arg Trp Thr Pro Pro Arg Ala 
1010 1015 1020 

CAG ATA ACA GGA TAC CGA CTG ACC GTG GGC CTT ACC CGA AGA GGC CAG 3119 
Gln Ile Thr Gly Tyr Arg Leu Thr Val Gly Leu Thr Arg Arg Gly Gln
1025 1030 1035 

CCC AGG CAG TAC AAT GTG GGT CCC TCT GTC TCC AAG TAC CCC CTG AGG 3167 
Pro Arg Gln Tyr Asn Val Gly Pro Ser Val Ser Lys Tyr Pro Leu Arg
1040 1045 1050 

AAT CTG CAG CCT GCA TCT GAG TAC ACC GTA TCC CTC GTG GCC ATA AAG 3215 
Asn Leu Gln Pro Ala Ser Glu Tyr Thr Val Ser Leu Val Ala Ile Lys
1055 1060 1065 1070
GGC AAC CAA GAG AGC CCC AAA GCC ACT GGA GTC TTT ACC ACA CTG CAG 3263
Gly Asn Gln Glu Ser Pro Lys Ala Thr Gly Val Phe Thr Thr Leu Gln
1075 1080 1085 

CCT GGG AGC TCT ATT CCA CCT TAC AAC ACC GAG GTG ACT GAG ACC ACC 3311 
Pro Gly Ser Ser Ile Pro Pro Tyr Asn Thr Glu Val Thr Glu Thr Thr
1090 1095 1100

ATC GTG ATC ACA TGG ACG CCT GCT CCA AGA ATT GGT TTT AAG CTG GGT 3359 
Ile Val Ile Thr Trp Thr Pro Ala Pro Arg Ile Gly Phe Lys Leu Gly
1105 1110 1115 

GTA CGA CCA AGC CAG GGA GGA GAG GCA CCA CGA GAA GTG ACT TCA GAC 3407 
Val Arg Pro Ser Gln Gly Gly Glu Ala Pro Arg Glu Val Thr Ser Asp
1120 1125 1130 

TCA GGA AGC ATC GTT GTG TCC GGC TTG ACT CCA GGA GTA GAA TAC GTC 3455 
Ser Gly Ser Ile Val Val Ser Gly Leu Thr Pro Gly Val Glu Tyr Val
1135 1140 1145 1150 

TAC ACC ATC CAA GTC CTG AGA GAT GGA CAG GAA AGA GAT GCG CCA ATT 3503 
Tyr Thr Ile Gln Val Leu Arg Asp Gly Gln Glu Arg Asp Ala Pro Ile
1155 1160 1165 

GTA AAC AAA GTG GTG ACA CCA TTG TCT CCA CCA ACA AAC TTG CAT CTG 3551 
Val Asn Lys Val Val Thr Pro Leu Ser Pro Pro Thr Asn Leu His Leu 
1170 1175 1180 

GAG GCA AAC CCT GAC ACT GGA GTG CTC ACA GTC TCC TGG GAG AGG AGC 3599 
Glu Ala Asn Pro Asp Thr Gly Val Leu Thr Val Ser Trp Glu Arg Ser 
1185 1190 1195 

ACC ACC CCA GAC ATT ACT GGT TAT AGA ATT ACC ACA ACC CCT ACA AAC 3647 
Thr Thr Pro Asp Ile Thr Gly Tyr Arg Ile Thr Thr Thr Pro Thr Asn
1200 1205 1210 

GGC CAG CAG GGA AAT TCT TTG GAA GAA GTG GTC CAT GCT GAT CAG AGC 3695 
Gly Gln Gln Gly Asn Ser Leu Glu Glu Val Val His Ala Asp Gln Ser
1215 1220 1225 1230 

TCC TGC ACT TTT GAT AAC CTG AGT CCC GGC CTG GAG TAC AAT GTC AGT 3743 
Ser Cys Thr Phe Asp Asn Leu Ser Pro Gly Leu Glu Tyr Asn Val Ser 
1235 1240 1245 

GTT TAC ACT GTC AAG GAT GAC AAG GAA AGT GTC CCT ATC TCT GAT ACC 3791 
Val Tyr Thr Val Lys Asp Asp Lys Glu Ser Val Pro Ile Ser Asp Thr
1250 1255 1260 

ATC ATC CCA GAG GTG CCC CAA CTC ACT GAC CTA AGC TTT GTT GAT ATA 3839 
Ile Ile Pro Glu Val Pro Gln Leu Thr Asp Leu Ser Phe Val Asp Ile
1265 1270 1275 

ACC GAT TCA AGC ATC GGC CTG AGG TGG ACC CCG CTA AAC TCT TCC ACC 3887 
Thr Asp Ser Ser Ile Gly Leu Arg Trp Thr Pro Leu Asn Ser Ser Thr
1280 1285 1290 
ATT ATT GGG TAC CGC ATC ACA GTA GTT GCG GCA GGA GAA GGT ATC CCT 3935
Ile Ile Gly Tyr Arg Ile Thr Val Val Ala Ala Gly Glu Gly Ile Pro
1295 1300 1305 1310

ATT TTT GAA GAT TTT GTG TAC TCC TCA GTA GGA TAC TAC ACA GTC ACA 3983
Ile Phe Glu Asp Phe Val Tyr Ser Ser Val Gly Tyr Tyr Thr Val Thr
1315 1320 1325 

GGG CTG GAG CCG GGC ATT GAC TAT GAT ATC AGC GTT ATC ACT CTC ATT 4031
Gly Leu Glu Pro Gly Ile Asp Tyr Asp Ile Ser Val Ile Thr Leu Ile
1330 1335 1340 

AAT GGC GGC GAG AGT GCC CCT ACT ACA CTG ACA CAA CAA ACG GCT GTT 4079
Asn Gly Gly Glu Ser Ala Pro Thr Thr Leu Thr Gln Gln Thr Ala Val
1345 1350 1355 



CCT CCT CCC ACT GAC CTG CGA TTC ACC AAC ATT GGT CCA GAC ACC ATG 4127
Pro Pro Pro Thr Asp Leu Arg Phe Thr Asn Ile Gly Pro Asp Thr Met
1360 1365 1370



CGT GTC ACC TGG GCT CCA CCC CCA TCC ATT GAT TTA ACC AAC TTC CTG 4175 
Arg Val Thr Trp Ala Pro Pro Pro Ser Ile Asp Leu Thr Asn Phe Leu
1375 1380 1385 1390 

GTG CGT TAC TCA CCT GTG AAA AAT GAG GAA GAT GTT GCA GAG TTG TCA 4223 
Val Arg Tyr Ser Pro Val Lys Asn Glu Glu Asp Val Ala Glu Leu Ser 
1395 1400 1405 

ATT TCT CCT TCA GAC AAT GCA GTG GTC TTA ACA AAT CTC CTG CCT GGT 4271 
Ile Ser Pro Ser Asp Asn Ala Val Val Leu Thr Asn Leu Leu Pro Gly
1410 1415 1420 



ACA GAA TAT GTA GTG AGT GTC TCC AGT GTC TAC GAA CAA CAT GAG AGC 4319
Thr Glu Tyr Val Val Ser Val Ser Ser Val Tyr Glu Gln His Glu Ser
1425 1430 1435



ACA CCT CTT AGA GGA AGA CAG AAA ACA GGT CTT GAT TCC CCA ACT GGC 4367
Thr Pro Leu Arg Gly Arg Gln Lys Thr Gly Leu Asp Ser Pro Thr Gly
1440 1445 1450



ATT GAC TTT TCT GAT ATT ACT GCC AAC TCT TTT ACT GTG CAC TGG ATT 4415 
Ile Asp Phe Ser Asp Ile Thr Ala Asn Ser Phe Thr Val His Trp Ile
1455 1460 1465 1470 

GCT CCT CGA GCC ACC ATC ACT GGC TAC AGG ATC CGC CAT CAT CCC GAG 4463 
Ala Pro Arg Ala Thr Ile Thr Gly Tyr Arg Ile Arg His His Pro Glu
1475 1480 1485 



CAC TTC AGT GGG AGA CCT CGA GAA GAT CGG GTG CCC CAC TCT CGG AAT 4511
His Phe Ser Gly Arg Pro Arg Glu Asp Arg Val Pro His Ser Arg Asn
1490 1495 1500
TCC ATC ACC CTC ACC AAC CTC ACT CCA GGC ACA GAG TAT GTG GTC AGC 4559
Ser Ile Thr Leu Thr Asn Leu Thr Pro Gly Thr Glu Tyr Val Val Ser
1505 1510 1515 

ATC GTT GCT CTT AAT GGC AGA GAG GAA AGT CCC TTA TTG ATT GGC CAA 4607 
Ile Val Ala Leu Asn Gly Arg Glu Glu Ser Pro Leu Leu Ile Gly Gln
1520 1525 1530 

CAA TCA ACA GTT TCT GAT GTT CCG AGG GAC CTG GAA GTT GTT GCT GCG 4655 
Gln Ser Thr Val Ser Asp Val Pro Arg Asp Leu Glu Val Val Ala Ala
1535 1540 1545 1550 

ACC CCC ACC AGC CTA CTG ATC AGC TGG GAT GCT CCT GCT GTC ACA GTG 4703 
Thr Pro Thr Ser Leu Leu Ile Ser Trp Asp Ala Pro Ala Val Thr Val
1555 1560 1565 

AGA TAT TAC AGG ATC ACT TAC GGA GAA ACA GGA GGA AAT AGC CCT GTC 4751 
Arg Tyr Tyr Arg Ile Thr Tyr Gly Glu Thr Gly Gly Asn Ser Pro Val
1570 1575 1580 

CAG GAG TTC ACT GTG CCT GGG AGC AAG TCT ACA GCT ACC ATC AGC GGC 4799 
Gln Glu Phe Thr Val Pro Gly Ser Lys Ser Thr Ala Thr Ile Ser Gly
1585 1590 1595 

CTT AAA CCT GGA GTT GAT TAT ACC ATC ACT GTG TAT GCT GTC ACT GGC 4847 
Leu Lys Pro Gly Val Asp Tyr Thr Ile Thr Val Tyr Ala Val Thr Gly
1600 1605 1610

CGT GGA GAC AGC CCC GCA AGC AGC AAG CCA ATT TCC ATT AAT TAC CGA 4895 
Arg Gly Asp Ser Pro Ala Ser Ser Lys Pro Ile Ser Ile Asn Tyr Arg
1615 1620 1625 1630 

ACA GAA ATT GAC AAA CCA TCC CAG ATG CAA GTG ACC GAT GTT CAG GAC 4943 
Thr Glu Ile Asp Lys Pro Ser Gln Met Gln Val Thr Asp Val Gln Asp
1635 1640 1645 

AAC AGC ATT AGT GTC AAG TGG CTG CCT TCA AGT TCC CCT GTT ACT GGT 4991 
Asn Ser Ile Ser Val Lys Trp Leu Pro Ser Ser Ser Pro Val Thr Gly
1650 1655 1660 

TAC AGA GTA ACC ACC ACT CCC AAA AAT GGA CCA GGA CCA ACA AAA ACT 5039 
Tyr Arg Val Thr Thr Thr Pro Lys Asn Gly Pro Gly Pro Thr Lys Thr 
1665 1670 1675 

AAA ACT GCA GGT CCA GAT CAA ACA GAA ATG ACT ATT GAA GGC TTG CAG 5087 
Lys Thr Ala Gly Pro Asp Gln Thr Glu Met Thr Ile Glu Gly Leu Gln
1680 1685 1690 
AAA GGA CTG GCA TTC ACT GAT GTG GAT GTC GAT TCC ATC AAA ATT GCT 5231
Lys Gly Leu Ala Phe Thr Asp Val Asp Val Asp Ser Ile Lys Ile Ala
1730 1735 1740 

TGG GAA AGC CCA CAG GGG CAA GTT TCC AGG TAC AGG GTG ACC TAC TCG 5279
Trp Glu Ser Pro Gln Gly Gln Val Ser Arg Tyr Arg Val Thr Tyr Ser
1745 1750 1755 



AGC CCT GAG GAT GGA ATC CAT GAG CTA TTC CCT GCA CCT GAT GGT GAA 5327
Ser Pro Glu Asp Gly Ile His Glu Leu Phe Pro Ala Pro Asp Gly Glu
1760 1765 1770

GAA GAC ACT GCA GAG CTG CAA GGC CTC AGA CCG GGT TCT GAG TAC ACA 5375
Glu Asp Thr Ala Glu Leu Gln Gly Leu Arg Pro Gly Ser Glu Tyr Thr
1775 1780 1785 1790

GTC AGT GTG GTT GCC TTG CAC GAT GAT ATG GAG AGC CAG CCC CTG ATT 5423
Val Ser Val Val Ala Leu His Asp Asp Met Glu Ser Gln Pro Leu Ile
1795 1800 1805

GGA ACC CAG TCC ACA GCT ATT CCT GCA CCA ACT GAC CTG AAG TTC ACT 5471
Gly Thr Gln Ser Thr Ala Ile Pro Ala Pro Thr Asp Leu Lys Phe Thr
1810 1815 1820

CAG GTC ACA CCC ACA AGC CTG AGC GCC CAG TGG ACA CCA CCC AAT GTT 5519
Gln Val Thr Pro Thr Ser Leu Ser Ala Gln Trp Thr Pro Pro Asn Val
1825 1830 1835

CAG CTC ACT GGA TAT CGA GTG CGG GTG ACC CCC AAG GAG AAG ACC GGA 5567
Gln Leu Thr Gly Tyr Arg Val Arg Val Thr Pro Lys Glu Lys Thr Gly
1840 1845 1850



CCA ATG AAA GAA ATC AAC CTT GCT CCT GAC AGC TCA TCC GTG GTT GTA 5615 
Pro Met Lys Glu Ile Asn Leu Ala Pro Asp Ser Ser Ser Val Val Val
1855 1860 1865 1870

TCA GGA CTT ATG GTG GCC ACC AAA TAT GAA GTG AGT GTC TAT GCT CTT 5663 
Ser Gly Leu Met Val Ala Thr Lys Tyr Glu Val Ser Val Tyr Ala Leu 
1875 1880 1885 

AAG GAC ACT TTG ACA AGC AGA CCA GCT CAG GGT GTT GTC ACC ACT CTG 5711 
Lys Asp Thr Leu Thr Ser Arg Pro Ala Gln Gly Val Val Thr Thr Leu
1890 1895 1900 

GAG AAT GTC AGC CCA CCA AGA AGG GCT CGT GTG ACA GAT GCT ACT GAG 5759 
Glu Asn Val Ser Pro Pro Arg Arg Ala Arg Val Thr Asp Ala Thr Glu 
1905 1910 1915 

ACC ACC ATC ACC ATT AGC TGG AGA ACC AAG ACT GAG ACG ATC ACT GGC 5807 
Thr Thr Ile Thr Ile Ser Trp Arg Thr Lys Thr Glu Thr Ile Thr Gly
1920 1925 1930 
TTC CAA GTT GAT GCC GTT CCA GCC AAT GGC CAG ACT CCA ATC CAG AGA 5855
Phe Gln Val Asp Ala Val Pro Ala Asn Gly Gln Thr Pro Ile Gln Arg
1935 1940 1945 1950 

ACC ATC AAG CCA GAT GTC AGA AGC TAC ACC ATC ACA GGT TTA CAA CCA 5903 
Thr Ile Lys Pro Asp Val Arg Ser Tyr Thr Ile Thr Gly Leu Gln Pro
1955 1960 1965 

GGC ACT GAC TAC AAG ATC TAC CTG TAC ACC TTG AAT GAC AAT GCT CGG 5951 
Gly Thr Asp Tyr Lys Ile Tyr Leu Tyr Thr Leu Asn Asp Asn Ala Arg
1970 1975 1980 

AGC TCC CCT GTG GTC ATC GAC GCC TCC ACT GCC ATT GAT GCA CCA TCC 5999 
Ser Ser Pro Val Val Ile Asp Ala Ser Thr Ala Ile Asp Ala Pro Ser
1985 1990 1995 

AAC CTG CGT TTC CTG GCC ACC ACA CCC AAT TCC TTG CTG GTA TCA TGG 6047 
Asn Leu Arg Phe Leu Ala Thr Thr Pro Asn Ser Leu Leu Val Ser Trp 
2000 2005 2010 

CAG CCG CCA CGT GCC AGG ATT ACC GGC TAC ATC ATC AAG TAT GAG AAG 6095 
Gln Pro Pro Arg Ala Arg Ile Thr Gly Tyr Ile Ile Lys Tyr Glu Lys
2015 2020 2025 2030 

CCT GGG TCT CCT CCC AGA GAA GTG GTC CCT CGG CCC CGC CCT GGT GTC 6143 
Pro Gly Ser Pro Pro Arg Glu Val Val Pro Arg Pro Arg Pro Gly Val 
2035 2040 2045 

ACA GAG GCT ACT ATT ACT GGC CTG GAA CCG GGA ACC GAA TAT ACA ATT 6191 
Thr Glu Ala Thr Ile Thr Gly Leu Glu Pro Gly Thr Glu Tyr Thr Ile
2050 2055 2060 

TAT GTC ATT GCC CTG AAG AAT AAT CAG AAG AGC GAG CCC CTG ATT GGA 6239 
Tyr Val Ile Ala Leu Lys Asn Asn Gln Lys Ser Glu Pro Leu Ile Gly
2065 2070 2075 

AGG AAA AAG ACA GAC GAG CTT CCC CAA CTG GTA ACC CTT CCA CAC CCC 6287 
Arg Lys Lys Thr Asp Glu Leu Pro Gln Leu Val Thr Leu Pro His Pro
2080 2085 2090 

AAT CTT CAT GGA CCA GAG ATC TTG GAT GTT CCT TCC ACA GTT CAA AAG 6335 
Asn Leu His Gly Pro Glu Ile Leu Asp Val Pro Ser Thr Val Gln Lys
2095 2100 2105 2110 

ACC CCT TTC GTC ACC CAC CCT GGG TAT GAC ACT GGA AAT GGT ATT CAG 6383 
Thr Pro Phe Val Thr His Pro Gly Tyr Asp Thr Gly Asn Gly Ile Gln
2115 2120 2125 

CTT CCT GGC ACT TCT GGT CAG CAA CCC AGT GTT GGG CAA CAA ATG ATC 6431 
Leu Pro Gly Thr Ser Gly Gln Gln Pro Ser Val Gly Gln Gln Met Ile
2130 2135 2140 

TTT GAG GAA CAT GGT TTT AGG CGG ACC ACA CCG CCC ACA ACG GCC ACC 6479 
Phe Glu Glu His Gly Phe Arg Arg Thr Thr Pro Pro Thr Thr Ala Thr 
2145 2150 2155 
CCC ATA AGG CAT AGG CCA AGA CCA TAC CCG CCG AAT GTA GGA CAA GAA 6527
Pro Ile Arg His Arg Pro Arg Pro Tyr Pro Pro Asn Val Gly Gln Glu
2160 2165 2170 

GCT CTC TCT CAG ACA ACC ATC TCA TGG GCC CCA TTC CAG GAC ACT TCT 6575 
Ala Leu Ser Gln Thr Thr Ile Ser Trp Ala Pro Phe Gln Asp Thr Ser
2175 2180 2185 2190 

GAG TAC ATC ATT TCA TGT CAT CCT GTT GGC ACT GAT GAA GAA CCC TTA 6623 
Glu Tyr Ile Ile Ser Cys His Pro Val Gly Thr Asp Glu Glu Pro Leu
2195 2200 2205 

CAG TTC AGG GTT CCT GGA ACT TCT ACC AGT GCC ACT CTG ACA GGC CTC 6671 
Gln Phe Arg Val Pro Gly Thr Ser Thr Ser Ala Thr Leu Thr Gly Leu
2210 2215 2220 

ACC AGA GGT GCC ACC TAC AAC ATC ATA GTG GAG GCA CTG AAA GAC CAG 6719 
Thr Arg Gly Ala Thr Tyr Asn Ile Ile Val Glu Ala Leu Lys Asp Gln
2225 2230 2235 

CAG AGG CAT AAG GTT CGG GAA GAG GTT GTT ACC GTG GGC AAC TCT GTC 6767 
Gln Arg His Lys Val Arg Glu Glu Val Val Thr Val Gly Asn Ser Val
2240 2245 2250 

AAC GAA GGC TTG AAC CAA CCT ACG GAT GAC TCG TGC TTT GAC CCC TAC 6815 
Asn Glu Gly Leu Asn Gln Pro Thr Asp Asp Ser Cys Phe Asp Pro Tyr
2255 2260 2265 2270 

ACA GTT TCC CAT TAT GCC GTT GGA GAT GAG TGG GAA CGA ATG TCT GAA 6863 
Thr Val Ser His Tyr Ala Val Gly Asp Glu Trp Glu Arg Met Ser Glu 
2275 2280 2285 

TCA GGC TTT AAA CTG TTG TGC CAG TGC TTA GGC TTT GGA AGT GGT CAT 6911 
Ser Gly Phe Lys Leu Leu Cys Gln Cys Leu Gly Phe Gly Ser Gly His
2290 2295 2300 

TTC AGA TGT GAT TCA TCT AGA TGG TGC CAT GAC AAT GGT GTG AAC TAC 6959 
Phe Arg Cys Asp Ser Ser Arg Trp Cys His Asp Asn Gly Val Asn Tyr 
2305 2310 2315 

AAG ATT GGA GAG AAG TGG GAC CGT CAG GGA GAA AAT GGC CAG ATG ATG 7007 
Lys Ile Gly Glu Lys Trp Asp Arg Gln Gly Glu Asn Gly Gln Met Met
2320 2325 2330 

AGC TGC ACA TGT CTT GGG AAC GGA AAA GGA GAA TTC AAG TGT GAC CCT 7055 
Ser Cys Thr Cys Leu Gly Asn Gly Lys Gly Glu Phe Lys Cys Asp Pro 
2335 2340 2345 2350 

CAT GAG GCA ACG TGT TAC GAT GAT GGG AAG ACA TAC CAC GTA GGA GAA 7103 
His Glu Ala Thr Cys Tyr Asp Asp Gly Lys Thr Tyr His Val Gly Glu 
2355 2360 2365 
CAG TGG CAG AAG GAA TAT CTC GGT GCC ATT TGC TCC TGC ACA TGC TTT 7151
Gln Trp Gln Lys Glu Tyr Leu Gly Ala Ile Cys Ser Cys Thr Cys Phe
2370 2375 2380 

GGA GGC CAG CGG GGC TGG CGC TGT GAC AAC TGC CGC AGA CCT GGG GGT 7199
Gly Gly Gln Arg Gly Trp Arg Cys Asp Asn Cys Arg Arg Pro Gly Gly
2385 2390 2395 

GAA CCC AGT CCC GAA GGC ACT ACT GGC CAG TCC TAC AAC CAG TAT TCT 7247
Glu Pro Ser Pro Glu Gly Thr Thr Gly Gln Ser Tyr Asn Gln Tyr Ser
2400 2405 2410

CAG AGA TAC CAT CAG AGA ACA AAC ACT AAT GTT AAT TGC CCA ATT GAG 7295
Gln Arg Tyr His Gln Arg Thr Asn Thr Asn Val Asn Cys Pro Ile Glu
2415 2420 2425 2430



TGC TTC ATG CCT TTA GAT GTA CAG GCT GAC AGA GAA GAT TCC CGA GAG 7343
Cys Phe Met Pro Leu Asp Val Gln Ala Asp Arg Glu Asp Ser Arg Glu
2435 2440 2445



TAAATCATCT TTCCAATCCA GAGGAACAAG CATGTCTCTC TGCCAAGATC CATCTAAACT 7403 

GGAGTGATGT TAGCAGACCC AGCTTAGAGT TCTTCTTTCT TTCTTAAGCC CTTTGCTCTG 7463 

GAGGAAGTTC TCCAGCTTCA GCTCAACTCA CAGCTTCTCC AAGCATCACC CTGGGAGTTT 7523 

CCTGAGGGTT TTCTCATAAA TGAGGGCTGC ACATTGCCTG TTCTGCTTCG AAGTATTCAA 7583 

TACCGCTCAG TATTTTAAAT GAAGTGATTC TAAGATTTGG TTTGGGATCA ATAGGAAAGC 7643 

ATATGCAGCC AACCAAGATG CAAATGTTTT GAAATGATAT GACCAAAATT TTAAGTAGGA 7703 

AAGTCACCCA AACACTTCTG CTTTCACTTA AGTGTCTGGC CCGCAATACT GTAGGAACAA 7763 

GCATGATCTT GTTACTGTGA TATTTTAAAT ATCCACAGTA 7803 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2446 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Leu Arg Gly Pro Gly Pro Gly Leu Leu Leu Leu Ala Val Leu Cys 
1 5 10 15

Leu Gly Thr Ala Val Pro Ser Thr Gly Ala Ser Lys Ser Lys Arg Gln
20 25 30 
Ala Gln Gln Met Val Gln Pro Gln Ser Pro Val Ala Val Ser Gln Ser
35 40 45 

Lys Pro Gly Cys Tyr Asp Asn Gly Lys His Tyr Gln Ile Asn Gln Gln
50 55 60 

Trp Glu Arg Thr Tyr Leu Gly Asn Val Leu Val Cys Thr Cys Tyr Gly 
65 70 75 80 

Gly Ser Arg Gly Phe Asn Cys Glu Ser Lys Pro Glu Ala Glu Glu Thr 
85 90 95 

Cys Phe Asp Lys Tyr Thr Gly Asn Thr Tyr Arg Val Gly Asp Thr Tyr 
100 105 110 

Glu Arg Pro Lys Asp Ser Met Ile Trp Asp Cys Thr Cys Ile Gly Ala
115 120 125 

Gly Arg Gly Arg Ile Ser Cys Thr Ile Ala Asn Arg Cys His Glu Gly
130 135 140 

Gly Gln Ser Tyr Lys Ile Gly Asp Thr Trp Arg Arg Pro His Glu Thr
145 150 155 160 

Gly Gly Tyr Met Leu Glu Cys Val Cys Leu Gly Asn Gly Lys Gly Glu 
165 170 175 

Trp Thr Cys Lys Pro Ile Ala Glu Lys Cys Phe Asp His Ala Ala Gly
180 185 190 

Thr Ser Tyr Val Val Gly Glu Thr Trp Glu Lys Pro Tyr Gln Gly Trp
195 200 205 

Met Met Val Asp Cys Thr Cys Leu Gly Glu Gly Ser Gly Arg Ile Thr
210 215 220 

Cys Thr Ser Arg Asn Arg Cys Asn Asp Gln Asp Thr Arg Thr Ser Tyr
225 230 235 240 

Arg Ile Gly Asp Thr Trp Ser Lys Lys Asp Asn Arg Gly Asn Leu Leu
245 250 255 

Gln Cys Ile Cys Thr Gly Asn Gly Arg Gly Glu Trp Lys Cys Glu Arg
260 265 270 

His Thr Ser Val Gln Thr Thr Ser Ser Gly Ser Gly Pro Phe Thr Asp
275 280 285 

Val Arg Ala Ala Val Tyr Gln Pro Gln Pro His Pro Gln Pro Pro Pro
290 295 300 

Tyr Gly His Cys Val Thr Asp Ser Gly Val Val Tyr Ser Val Gly Met 
305 310 315 320 
Gln Trp Leu Lys Thr Gln Gly Asn Lys Gln Met Leu Cys Thr Cys Leu
325 330 335 

Gly Asn Gly Val Ser Cys Gln Glu Thr Ala Val Thr Gln Thr Tyr Gly
340 345 350 

Gly Asn Leu Asn Gly Glu Pro Cys Val Leu Pro Phe Thr Tyr Asn Gly 
355 360 365 

Arg Thr Phe Tyr Ser Cys Thr Thr Glu Gly Arg Gln Asp Gly His Leu
370 375 380 

Trp Cys Ser Thr Thr Ser Asn Tyr Glu Gln Asp Gln Lys Tyr Ser Phe
385 390 395 400 

Cys Thr Asp His Thr Val Leu Val Gln Thr Gln Gly Gly Asn Ser Asn
405 410 415 

Gly Ala Leu Cys His Phe Pro Phe Leu Tyr Asn Asn His Asn Tyr Thr 
420 425 430 

Asp Cys Thr Ser Glu Gly Arg Arg Asp Asn Met Lys Trp Cys Gly Thr 
435 440 445 

Thr Gln Asn Tyr Asp Ala Asp Gln Lys Phe Gly Phe Cys Pro Met Ala
450 455 460

Ala His Glu Glu Ile Cys Thr Thr Asn Glu Gly Val Met Tyr Arg Ile
465 470 475 480 

Gly Asp Gln Trp Asp Lys Gln His Asp Met Gly His Met Met Arg Cys
485 490 495 

Thr Cys Val Gly Asn Gly Arg Gly Glu Trp Thr Cys Ile Ala Tyr Ser
500 505 510 

Gln Leu Arg Asp Gln Cys Ile Val Asp Asp Ile Thr Tyr Asn Val Asn
515 520 525 

Asp Thr Phe His Lys Arg His Glu Glu Gly His Met Leu Asn Cys Thr 
530 535 540 

Cys Phe Gly Gln Gly Arg Gly Arg Trp Lys Cys Asp Pro Val Asp Gln
545 550 555 560 

Cys Gln Asp Ser Glu Thr Gly Thr Phe Tyr Gln Ile Gly Asp Ser Trp
565 570 575 

Glu Lys Tyr Val His Gly Val Arg Tyr Gln Cys Tyr Cys Tyr Gly Arg
580 585 590 

Gly Ile Gly Glu Trp His Cys Gln Pro Leu Gln Thr Tyr Pro Ser Ser
595 600 605 
Ser Gly Pro Val Glu Val Phe Ile Thr Glu Thr Pro Ser Gln Pro Asn
610 615 620 

Ser His Pro Ile Gln Trp Asn Ala Pro Gln Pro Ser His Ile Ser Lys
625 630 635 640 

Tyr Ile Leu Arg Trp Arg Pro Lys Asn Ser Val Gly Arg Trp Lys Glu
645 650 655 

Ala Thr Ile Pro Gly His Leu Asn Ser Tyr Thr Ile Lys Gly Leu Lys
660 665 670 

Pro Gly Val Val Tyr Glu Gly Gln Leu Ile Ser Ile Gln Gln Tyr Gly
675 680 685 

His Gln Glu Val Thr Arg Phe Asp Phe Thr Thr Thr Ser Thr Ser Thr
690 695 700 

Pro Val Thr Ser Asn Thr Val Thr Gly Glu Thr Thr Pro Phe Ser Pro 
705 710 715 720 

Leu Val Ala Thr Ser Glu Ser Val Thr Glu Ile Thr Ala Ser Ser Phe
725 730 735 

Val Val Ser Trp Val Ser Ala Ser Asp Thr Val Ser Gly Phe Arg Val 
740 745 750 

Glu Tyr Glu Leu Ser Glu Glu Gly Asp Glu Pro Gln Tyr Leu Asp Leu
755 760 765 

Pro Ser Thr Ala Thr Ser Val Asn Ile Pro Asp Leu Leu Pro Gly Arg
770 775 780 

Lys Tyr Ile Val Asn Val Tyr Gln Ile Ser Glu Asp Gly Glu Gln Ser
785 790 795 800 

Leu Ile Leu Ser Thr Ser Gln Thr Thr Ala Pro Asp Ala Pro Pro Asp
805 810 815 

Pro Thr Val Asp Gln Val Asp Asp Thr Ser Ile Val Val Arg Trp Ser
820 825 830 

Arg Pro Gln Ala Pro Ile Thr Gly Tyr Arg Ile Val Tyr Ser Pro Ser
835 840 845 

Val Glu Gly Ser Ser Thr Glu Leu Asn Leu Pro Glu Thr Ala Asn Ser 
850 855 860 

Val Thr Leu Ser Asp Leu Gln Pro Gly Val Gln Tyr Asn Ile Thr Ile
865 870 875 880 

Tyr Ala Val Glu Glu Asn Gln Glu Ser Thr Pro Val Val Ile Gln Gln
885 890 895 
Glu Thr Thr Gly Thr Pro Arg Ser Asp Thr Val Pro Ser Pro Arg Asp
900 905 910 

Leu Gln Phe Val Glu Val Thr Asp Val Lys Val Thr Ile Met Trp Thr
915 920 925

Pro Pro Glu Ser Ala Val Thr Gly Tyr Arg Val Asp Val Ile Pro Val
930 935 940 

Asn Leu Pro Gly Glu His Gly Gln Arg Leu Pro Ile Ser Arg Asn Thr
945 950 955 960

Phe Ala Glu Val Thr Gly Leu Ser Pro Gly Val Thr Tyr Tyr Phe Lys 
965 970 975 

Val Phe Ala Val Ser His Gly Arg Glu Ser Lys Pro Leu Thr Ala Gln
980 985 990 

Gln Thr Thr Lys Leu Asp Ala Pro Thr Asn Leu Gln Phe Val Asn Glu
995 1000 1005 

Thr Asp Ser Thr Val Leu Val Arg Trp Thr Pro Pro Arg Ala Gln Ile
1010 1015 1020 

Thr Gly Tyr Arg Leu Thr Val Gly Leu Thr Arg Arg Gly Gln Pro Arg
1025 1030 1035 1040 

Gln Tyr Asn Val Gly Pro Ser Val Ser Lys Tyr Pro Leu Arg Asn Leu
1045 1050 1055 

Gln Pro Ala Ser Glu Tyr Thr Val Ser Leu Val Ala Ile Lys Gly Asn
1060 1065 1070 

Gln Glu Ser Pro Lys Ala Thr Gly Val Phe Thr Thr Leu Gln Pro Gly
1075 1080 1085 

Ser Ser Ile Pro Pro Tyr Asn Thr Glu Val Thr Glu Thr Thr Ile Val
1090 1095 1100 

Ile Thr Trp Thr Pro Ala Pro Arg Ile Gly Phe Lys Leu Gly Val Arg
1105 1110 1115 1120 

Pro Ser Gln Gly Gly Glu Ala Pro Arg Glu Val Thr Ser Asp Ser Gly
1125 1130 1135 

Ser Ile Val Val Ser Gly Leu Thr Pro Gly Val Glu Tyr Val Tyr Thr
1140 1145 1150 

Ile Gln Val Leu Arg Asp Gly Gln Glu Arg Asp Ala Pro Ile Val Asn
1155 1160 1165 

Lys Val Val Thr Pro Leu Ser Pro Pro Thr Asn Leu His Leu Glu Ala 
1170 1175 1180 
Asn Pro Asp Thr Gly Val Leu Thr Val Ser Trp Glu Arg Ser Thr Thr
1185 1190 1195 1200

Pro Asp Ile Thr Gly Tyr Arg Ile Thr Thr Thr Pro Thr Asn Gly Gln
1205 1210 1215 

Gln Gly Asn Ser Leu Glu Glu Val Val His Ala Asp Gln Ser Ser Cys
1220 1225 1230 

Thr Phe Asp Asn Leu Ser Pro Gly Leu Glu Tyr Asn Val Ser Val Tyr 
1235 1240 1245

Thr Val Lys Asp Asp Lys Glu Ser Val Pro Ile Ser Asp Thr Ile Ile
1250 1255 1260 

Pro Glu Val Pro Gln Leu Thr Asp Leu Ser Phe Val Asp Ile Thr Asp
1265 1270 1275 1280 

Ser Ser Ile Gly Leu Arg Trp Thr Pro Leu Asn Ser Ser Thr Ile Ile
1285 1290 1295 

Gly Tyr Arg Ile Thr Val Val Ala Ala Gly Glu Gly Ile Pro Ile Phe
1300 1305 1310 

Glu Asp Phe Val Tyr Ser Ser Val Gly Tyr Tyr Thr Val Thr Gly Leu 
1315 1320 1325

Glu Pro Gly Ile Asp Tyr Asp Ile Ser Val Ile Thr Leu Ile Asn Gly
1330 1335 1340 

Gly Glu Ser Ala Pro Thr Thr Leu Thr Gln Gln Thr Ala Val Pro Pro
1345 1350 1355 1360 

Pro Thr Asp Leu Arg Phe Thr Asn Ile Gly Pro Asp Thr Met Arg Val
1365 1370 1375 

Thr Trp Ala Pro Pro Pro Ser Ile Asp Leu Thr Asn Phe Leu Val Arg
1380 1385 1390 

Tyr Ser Pro Val Lys Asn Glu Glu Asp Val Ala Glu Leu Ser Ile Ser
1395 1400 1405 

Pro Ser Asp Asn Ala Val Val Leu Thr Asn Leu Leu Pro Gly Thr Glu 
1410 1415 1420 

Tyr Val Val Ser Val Ser Ser Val Tyr Glu Gln His Glu Ser Thr Pro
1425 1430 1435 1440 

Leu Arg Gly Arg Gln Lys Thr Gly Leu Asp Ser Pro Thr Gly Ile Asp
1445 1450 1455 

Phe Ser Asp Ile Thr Ala Asn Ser Phe Thr Val His Trp Ile Ala Pro
1460 1465 1470 
Arg Ala Thr Ile Thr Gly Tyr Arg Ile Arg His His Pro Glu His Phe
1475 1480 1485 

Ser Gly Arg Pro Arg Glu Asp Arg Val Pro His Ser Arg Asn Ser Ile
1490 1495 1500 

Thr Leu Thr Asn Leu Thr Pro Gly Thr Glu Tyr Val Val Ser Ile Val
1505 1510 1515 1520 

Ala Leu Asn Gly Arg Glu Glu Ser Pro Leu Leu Ile Gly Gln Gln Ser
1525 1530 1535 

Thr Val Ser Asp Val Pro Arg Asp Leu Glu Val Val Ala Ala Thr Pro 
1540 1545 1550 

Thr Ser Leu Leu Ile Ser Trp Asp Ala Pro Ala Val Thr Val Arg Tyr
1555 1560 1565 

Tyr Arg Ile Thr Tyr Gly Glu Thr Gly Gly Asn Ser Pro Val Gln Glu
1570 1575 1580 

Phe Thr Val Pro Gly Ser Lys Ser Thr Ala Thr Ile Ser Gly Leu Lys
1585 1590 1595 1600 

Pro Gly Val Asp Tyr Thr Ile Thr Val Tyr Ala Val Thr Gly Arg Gly
1605 1610 1615 

Asp Ser Pro Ala Ser Ser Lys Pro Ile Ser Ile Asn Tyr Arg Thr Glu
1620 1625 1630 

Ile Asp Lys Pro Ser Gln Met Gln Val Thr Asp Val Gln Asp Asn Ser
1635 1640 1645 

Ile Ser Val Lys Trp Leu Pro Ser Ser Ser Pro Val Thr Gly Tyr Arg
1650 1655 1660 

Val Thr Thr Thr Pro Lys Asn Gly Pro Gly Pro Thr Lys Thr Lys Thr 
1665 1670 1675 1680 

Ala Gly Pro Asp Gln Thr Glu Met Thr Ile Glu Gly Leu Gln Pro Thr
1685 1690 1695 

Val Glu Tyr Val Val Ser Val Tyr Ala Gln Asn Pro Ser Gly Glu Ser
1700 1705 1710 

Gln Pro Leu Val Gln Thr Ala Val Thr Asn Ile Asp Arg Pro Lys Gly
1715 1720 1725 

Leu Ala Phe Thr Asp Val Asp Val Asp Ser Ile Lys Ile Ala Trp Glu
1730 1735 1740 

Ser Pro Gln Gly Gln Val Ser Arg Tyr Arg Val Thr Tyr Ser Ser Pro
1745 1750 1755 1760 
Glu Asp Gly Ile His Glu Leu Phe Pro Ala Pro Asp Gly Glu Glu Asp
1765 1770 1775 

Thr Ala Glu Leu Gln Gly Leu Arg Pro Gly Ser Glu Tyr Thr Val Ser
1780 1785 1790 

Val Val Ala Leu His Asp Asp Met Glu Ser Gln Pro Leu Ile Gly Thr
1795 1800 1805 

Gln Ser Thr Ala Ile Pro Ala Pro Thr Asp Leu Lys Phe Thr Gln Val
1810 1815 1820 

Thr Pro Thr Ser Leu Ser Ala Gln Trp Thr Pro Pro Asn Val Gln Leu
1825 1830 1835 1840 

Thr Gly Tyr Arg Val Arg Val Thr Pro Lys Glu Lys Thr Gly Pro Met 
1845 1850 1855 

Lys Glu Ile Asn Leu Ala Pro Asp Ser Ser Ser Val Val Val Ser Gly
1860 1865 1870 

Leu Met Val Ala Thr Lys Tyr Glu Val Ser Val Tyr Ala Leu Lys Asp 
1875 1880 1885 

Thr Leu Thr Ser Arg Pro Ala Gin 61y Val Val Thr Thr Leu 61u Asn 
1890 1895 1900 

Val Ser Pro Pro Arg Arg Ala Arg Val Thr Asp Ala Thr 61 u Thr Thr 
1905 1910 1915 1920 

He Thr He Ser Trp Arg Thr Lys Thr 61 u Thr He Thr Gly Phe Gin 
1925 1930 1935 

Val Asp Ala Val Pro Ala Asn Gly Gin Thr Pro He 61n Arg Thr He 
1940 1945 1950 

Lys Pro Asp Val Arg Ser Tyr Thr He Thr 61y Leu 61 n Pro Gly Thr 
1955 1960 1965 

Asp Tyr Lys He Tyr Leu Tyr Thr Leu Asn Asp Asn Ala Arg Ser Ser 
1970 1975 1980 

Pro Val Val He Asp Ala Ser Thr Ala He Asp Ala Pro Ser Asn Leu 
1985 1990 1995 2000 

Arg Phe Leu Ala Thr Thr Pro Asn Ser Leu Leu Val Ser Trp Gin Pro 
2005 2010 2015 

Pro Arg Ala Arg He Thr Gly Tyr He He Lys Tyr Glu Lys Pro Gly 
2020 2025 2030 



Ser Pro Pro Arg Glu Val Val Pro Arg Pro Arg Pro Gly Val Thr Glu
2035 2040 2045

Ala Thr Ile Thr Gly Leu Glu Pro Gly Thr Glu Tyr Thr Ile Tyr Val
2050 2055 2060

Ile Ala Leu Lys Asn Asn Gln Lys Ser Glu Pro Leu Ile Gly Arg Lys
2065 2070 2075 2080

Lys Thr Asp Glu Leu Pro Gln Leu Val Thr Leu Pro His Pro Asn Leu
2085 2090 2095

His Gly Pro Glu Ile Leu Asp Val Pro Ser Thr Val Gln Lys Thr Pro
2100 2105 2110

Phe Val Thr His Pro Gly Tyr Asp Thr Gly Asn Gly Ile Gln Leu Pro
2115 2120 2125

Gly Thr Ser Gly Gln Gln Pro Ser Val Gly Gln Gln Met Ile Phe Glu
2130 2135 2140

Glu His Gly Phe Arg Arg Thr Thr Pro Pro Thr Thr Ala Thr Pro Ile
2145 2150 2155 2160

Arg His Arg Pro Arg Pro Tyr Pro Pro Asn Val Gly Gln Glu Ala Leu
2165 2170 2175

Ser Gln Thr Thr Ile Ser Trp Ala Pro Phe Gln Asp Thr Ser Glu Tyr
2180 2185 2190

Ile Ile Ser Cys His Pro Val Gly Thr Asp Glu Glu Pro Leu Gln Phe
2195 2200 2205

Arg Val Pro Gly Thr Ser Thr Ser Ala Thr Leu Thr Gly Leu Thr Arg
2210 2215 2220

Gly Ala Thr Tyr Asn Ile Ile Val Glu Ala Leu Lys Asp Gln Gln Arg
2225 2230 2235 2240

His Lys Val Arg Glu Glu Val Val Thr Val Gly Asn Ser Val Asn Glu
2245 2250 2255

Gly Leu Asn Gln Pro Thr Asp Asp Ser Cys Phe Asp Pro Tyr Thr Val
2260 2265 2270

Ser His Tyr Ala Val Gly Asp Glu Trp Glu Arg Met Ser Glu Ser Gly
2275 2280 2285

Phe Lys Leu Leu Cys Gln Cys Leu Gly Phe Gly Ser Gly His Phe Arg
2290 2295 2300

Cys Asp Ser Ser Arg Trp Cys His Asp Asn Gly Val Asn Tyr Lys Ile
2305 2310 2315 2320

Gly Glu Lys Trp Asp Arg Gln Gly Glu Asn Gly Gln Met Met Ser Cys
2325 2330 2335

Thr Cys Leu Gly Asn Gly Lys Gly Glu Phe Lys Cys Asp Pro His Glu
2340 2345 2350

Ala Thr Cys Tyr Asp Asp Gly Lys Thr Tyr His Val Gly Glu Gln Trp
2355 2360 2365

Gln Lys Glu Tyr Leu Gly Ala Ile Cys Ser Cys Thr Cys Phe Gly Gly
2370 2375 2380

Gln Arg Gly Trp Arg Cys Asp Asn Cys Arg Arg Pro Gly Gly Glu Pro
2385 2390 2395 2400

Ser Pro Glu Gly Thr Thr Gly Gln Ser Tyr Asn Gln Tyr Ser Gln Arg
2405 2410 2415

Tyr His Gln Arg Thr Asn Thr Asn Val Asn Cys Pro Ile Glu Cys Phe
2420 2425 2430

Met Pro Leu Asp Val Gln Ala Asp Arg Glu Asp Ser Arg Glu
2435 2440 2445

(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2179 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 31.. 1962 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GTCTAGGAGC CAGCCCCACC CTTAGAAAAG ATG TTT TCC ATG AGG ATC GTC TGC 54
Met Phe Ser Met Arg Ile Val Cys
1 5

CTA GTT CTA AGT GTG GTG GGC ACA GCA TGG ACT GCA GAT AGT GGT GAA 102
Leu Val Leu Ser Val Val Gly Thr Ala Trp Thr Ala Asp Ser Gly Glu
10 15 20

GGT GAC TTT CTA GCT GAA GGA GGA GGC GTG CGT GGC CCA AGG GTT GTG 150
Gly Asp Phe Leu Ala Glu Gly Gly Gly Val Arg Gly Pro Arg Val Val
25 30 35 40

GAA AGA CAT CAA TCT GCC TGC AAA GAT TCA GAC TGG CCC TTC TGC TCT 198
Glu Arg His Gln Ser Ala Cys Lys Asp Ser Asp Trp Pro Phe Cys Ser
45 50 55

GAT GAA GAC TGG AAC TAC AAA TGC CCT TCT GGC TGC AGG ATG AAA GGG 246
Asp Glu Asp Trp Asn Tyr Lys Cys Pro Ser Gly Cys Arg Met Lys Gly
60 65 70

TTG ATT GAT GAA GTC AAT CAA GAT TTT ACA AAC AGA ATA AAT AAG CTC 294
Leu Ile Asp Glu Val Asn Gln Asp Phe Thr Asn Arg Ile Asn Lys Leu
75 80 85

AAA AAT TCA CTA TTT GAA TAT CAG AAG AAC AAT AAG GAT TCT CAT TCG 342
Lys Asn Ser Leu Phe Glu Tyr Gln Lys Asn Asn Lys Asp Ser His Ser
90 95 100

TTG ACC ACT AAT ATA ATG GAA ATT TTG AGA GGC GAT TTT TCC TCA GCC 390
Leu Thr Thr Asn Ile Met Glu Ile Leu Arg Gly Asp Phe Ser Ser Ala
105 110 115 120

AAT AAC CGT GAT AAT ACC TAC AAC CGA GTG TCA GAG GAT CTG AGA AGC 438
Asn Asn Arg Asp Asn Thr Tyr Asn Arg Val Ser Glu Asp Leu Arg Ser
125 130 135

AGA ATT GAA GTC CTG AAG CGC AAA GTC ATA GAA AAA GTA CAG CAT ATC 486
Arg Ile Glu Val Leu Lys Arg Lys Val Ile Glu Lys Val Gln His Ile
140 145 150

CAG CTT CTG CAG AAA AAT GTT AGA GCT CAG TTG GTT GAT ATG AAA CGA 534
Gln Leu Leu Gln Lys Asn Val Arg Ala Gln Leu Val Asp Met Lys Arg
155 160 165

CTG GAG GTG GAC ATT GAT ATT AAG ATC CGA TCT TGT CGA GGG TCA TGG 582
Leu Glu Val Asp Ile Asp Ile Lys Ile Arg Ser Cys Arg Gly Ser Trp
170 175 180

AGT AGG GCT TTA GCT CGT GAA GTA GAT CTG AAG GAC TAT GAA GAT CAG 630
Ser Arg Ala Leu Ala Arg Glu Val Asp Leu Lys Asp Tyr Glu Asp Gln
185 190 195 200

CAG AAG CAA CTT GAA CAG GTC ATT GCC AAA GAC TTA CTT CCC TCT AGA 678
Gln Lys Gln Leu Glu Gln Val Ile Ala Lys Asp Leu Leu Pro Ser Arg
205 210 215

GAT AGG CAA CAC TTA CCA CTG ATA AAA ATG AAA CCA GTT CCA GAC TTG 726
Asp Arg Gln His Leu Pro Leu Ile Lys Met Lys Pro Val Pro Asp Leu
220 225 230

GTT CCC GGA AAT TTT AAG AGC CAG CTT CAG AAG GTA CCC CCA GAG TGG 774
Val Pro Gly Asn Phe Lys Ser Gln Leu Gln Lys Val Pro Pro Glu Trp
235 240 245

AAG GCA TTA ACA GAC ATG CCG CAG ATG AGA ATG GAG TTA GAG AGA CCT 822
Lys Ala Leu Thr Asp Met Pro Gln Met Arg Met Glu Leu Glu Arg Pro
250 255 260

GGT GGA AAT GAG ATT ACT CGA GGA GGC TCC ACC TCT TAT GGA ACC GGA 870
Gly Gly Asn Glu Ile Thr Arg Gly Gly Ser Thr Ser Tyr Gly Thr Gly
265 270 275 280

TCA GAG ACG GAA AGC CCC AGG AAC CCT AGC AGT GCT GGA AGC TGG AAC 918
Ser Glu Thr Glu Ser Pro Arg Asn Pro Ser Ser Ala Gly Ser Trp Asn
285 290 295

TCT GGG AGC TCT GGA CCT GGA AGT ACT GGA AAC CGA AAC CCT GGG AGC 966
Ser Gly Ser Ser Gly Pro Gly Ser Thr Gly Asn Arg Asn Pro Gly Ser
300 305 310

TCT GGG ACT GGA GGG ACT GCA ACC TGG AAA CCT GGG AGC TCT GGA CCT 1014
Ser Gly Thr Gly Gly Thr Ala Thr Trp Lys Pro Gly Ser Ser Gly Pro
315 320 325

GGA AGT GCT GGA AGC TGG AAC TCT GGG AGC TCT GGA ACT GGA AGT ACT 1062
Gly Ser Ala Gly Ser Trp Asn Ser Gly Ser Ser Gly Thr Gly Ser Thr
330 335 340

GGA AAC CAA AAC CCT GGA AGT CCT AGA CCT GGT AGT ACC GGA ACC TGG 1110
Gly Asn Gln Asn Pro Gly Ser Pro Arg Pro Gly Ser Thr Gly Thr Trp
345 350 355 360

AAT CCT GGC AGC TCT GAA CGC GGA AGT GCT GGG CAC TGG ACC TCT GAG 1158
Asn Pro Gly Ser Ser Glu Arg Gly Ser Ala Gly His Trp Thr Ser Glu
365 370 375

AGC TCT GTA TCT GGT AGT ACT GGA CAA TGG CAC TCT GAA TCT GGA AGT 1206
Ser Ser Val Ser Gly Ser Thr Gly Gln Trp His Ser Glu Ser Gly Ser
380 385 390

TTT AGG CCA GAT AGC CCA GGC TCT GGG AAC GCG AGG CCT AAC AAC CCA 1254
Phe Arg Pro Asp Ser Pro Gly Ser Gly Asn Ala Arg Pro Asn Asn Pro
395 400 405

GAC TGG GGC ACA TTT GAA GAG GTG TCA GGA AAT GTA AGT CCA GGG ACA 1302
Asp Trp Gly Thr Phe Glu Glu Val Ser Gly Asn Val Ser Pro Gly Thr
410 415 420

AGG AGA GAG TAC CAC ACA GAA AAA CTG GTC ACT AAA GGA GAT AAA GAG 1350
Arg Arg Glu Tyr His Thr Glu Lys Leu Val Thr Lys Gly Asp Lys Glu
425 430 435 440

CTC AGG ACT GGT AAA GAG AAG GTC ACC TCT GGT AGC ACA ACC ACC ACG 1398
Leu Arg Thr Gly Lys Glu Lys Val Thr Ser Gly Ser Thr Thr Thr Thr
445 450 455

CGT CGT TCA TGC TCT AAA ACC GTT ACT AAG ACT GTT ATT GGT CCT GAT 1446
Arg Arg Ser Cys Ser Lys Thr Val Thr Lys Thr Val Ile Gly Pro Asp
460 465 470

GGT CAC AAA GAA GTT ACC AAA GAA GTG GTG ACC TCC GAA GAT GGT TCT 1494
Gly His Lys Glu Val Thr Lys Glu Val Val Thr Ser Glu Asp Gly Ser
475 480 485

GAC TGT CCC GAG GCA ATG GAT TTA GGC ACA TTG TCT GGC ATA GGT ACT 1542
Asp Cys Pro Glu Ala Met Asp Leu Gly Thr Leu Ser Gly Ile Gly Thr
490 495 500

CTG GAT GGG TTC CGT CAT AGG CAC CCT GAT GAA GCT GCC TTC TTC GAC 1590
Leu Asp Gly Phe Arg His Arg His Pro Asp Glu Ala Ala Phe Phe Asp
505 510 515 520

ACT GCC TCA ACT GGA AAA ACA TTC CCA GGT TTC TTC TCA CCT ATG TTA 1638
Thr Ala Ser Thr Gly Lys Thr Phe Pro Gly Phe Phe Ser Pro Met Leu
525 530 535

GGA GAG TTT GTC AGT GAG ACT GAG TCT AGG GGC TCA GAA TCT GGC ATC 1686
Gly Glu Phe Val Ser Glu Thr Glu Ser Arg Gly Ser Glu Ser Gly Ile
540 545 550

TTC ACA AAT ACA AAG GAA TCC AGT TCT CAT CAC CCT GGG ATA GCT GAA 1734
Phe Thr Asn Thr Lys Glu Ser Ser Ser His His Pro Gly Ile Ala Glu
555 560 565

TTC CCT TCC CGT GGT AAA TCT TCA AGT TAC AGC AAA CAA TTT ACT AGT 1782
Phe Pro Ser Arg Gly Lys Ser Ser Ser Tyr Ser Lys Gln Phe Thr Ser
570 575 580

AGC ACG AGT TAC AAC AGA GGA GAC TCC ACA TTT GAA AGC AAG AGC TAT 1830
Ser Thr Ser Tyr Asn Arg Gly Asp Ser Thr Phe Glu Ser Lys Ser Tyr
585 590 595 600

AAA ATG GCA GAT GAG GCC GGA AGT GAA GCC GAT CAT GAA GGA ACA CAT 1878
Lys Met Ala Asp Glu Ala Gly Ser Glu Ala Asp His Glu Gly Thr His
605 610 615

AGC ACC AAG AGA GGG CAT GCT AAA TCT CGC CCT GTC AGA GGT ATC CAC 1926
Ser Thr Lys Arg Gly His Ala Lys Ser Arg Pro Val Arg Gly Ile His
620 625 630

ACT TCT CCT TTG GGG AAG CCT TCC CTG TCC CCC TAGACTAAGT TAAATATTTC 1979
Thr Ser Pro Leu Gly Lys Pro Ser Leu Ser Pro
635 640

TGCACAGTGT TCCCATGGCC CCTTGCATTT CCTTCTTAAC TCTCTGTTAC ACGTCATTGA 2039

AACTACACTT TTTTGGTCTG TTTTTGTGCT AGACTGTAAG TTCCTTGGGG GCAGGGCCTT 2099

TGTCTGTCTC ATCTCTGTAT TCCCAAATGC CTAACAGTAC AGAGCCATGA CTCAATAAAT 2159

ACATGTTAAA TGGATGAATG 2179

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 643 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Phe Ser Met Arg Ile Val Cys Leu Val Leu Ser Val Val Gly Thr
1 5 10 15

Ala Trp Thr Ala Asp Ser Gly Glu Gly Asp Phe Leu Ala Glu Gly Gly
20 25 30

Gly Val Arg Gly Pro Arg Val Val Glu Arg His Gln Ser Ala Cys Lys
35 40 45

Asp Ser Asp Trp Pro Phe Cys Ser Asp Glu Asp Trp Asn Tyr Lys Cys
50 55 60

Pro Ser Gly Cys Arg Met Lys Gly Leu Ile Asp Glu Val Asn Gln Asp
65 70 75 80

Phe Thr Asn Arg Ile Asn Lys Leu Lys Asn Ser Leu Phe Glu Tyr Gln
85 90 95

Lys Asn Asn Lys Asp Ser His Ser Leu Thr Thr Asn Ile Met Glu Ile
100 105 110

Leu Arg Gly Asp Phe Ser Ser Ala Asn Asn Arg Asp Asn Thr Tyr Asn
115 120 125

Arg Val Ser Glu Asp Leu Arg Ser Arg Ile Glu Val Leu Lys Arg Lys
130 135 140

Val Ile Glu Lys Val Gln His Ile Gln Leu Leu Gln Lys Asn Val Arg
145 150 155 160

Ala Gln Leu Val Asp Met Lys Arg Leu Glu Val Asp Ile Asp Ile Lys
165 170 175

Ile Arg Ser Cys Arg Gly Ser Trp Ser Arg Ala Leu Ala Arg Glu Val
180 185 190

Asp Leu Lys Asp Tyr Glu Asp Gln Gln Lys Gln Leu Glu Gln Val Ile
195 200 205

Ala Lys Asp Leu Leu Pro Ser Arg Asp Arg Gln His Leu Pro Leu Ile
210 215 220

Lys Met Lys Pro Val Pro Asp Leu Val Pro Gly Asn Phe Lys Ser Gln
225 230 235 240

Leu Gln Lys Val Pro Pro Glu Trp Lys Ala Leu Thr Asp Met Pro Gln
245 250 255

Met Arg Met Glu Leu Glu Arg Pro Gly Gly Asn Glu Ile Thr Arg Gly
260 265 270

Gly Ser Thr Ser Tyr Gly Thr Gly Ser Glu Thr Glu Ser Pro Arg Asn
275 280 285

Pro Ser Ser Ala Gly Ser Trp Asn Ser Gly Ser Ser Gly Pro Gly Ser
290 295 300

Thr Gly Asn Arg Asn Pro Gly Ser Ser Gly Thr Gly Gly Thr Ala Thr
305 310 315 320

Trp Lys Pro Gly Ser Ser Gly Pro Gly Ser Ala Gly Ser Trp Asn Ser
325 330 335

Gly Ser Ser Gly Thr Gly Ser Thr Gly Asn Gln Asn Pro Gly Ser Pro
340 345 350

Arg Pro Gly Ser Thr Gly Thr Trp Asn Pro Gly Ser Ser Glu Arg Gly
355 360 365

Ser Ala Gly His Trp Thr Ser Glu Ser Ser Val Ser Gly Ser Thr Gly
370 375 380

Gln Trp His Ser Glu Ser Gly Ser Phe Arg Pro Asp Ser Pro Gly Ser
385 390 395 400

Gly Asn Ala Arg Pro Asn Asn Pro Asp Trp Gly Thr Phe Glu Glu Val
405 410 415

Ser Gly Asn Val Ser Pro Gly Thr Arg Arg Glu Tyr His Thr Glu Lys
420 425 430

Leu Val Thr Lys Gly Asp Lys Glu Leu Arg Thr Gly Lys Glu Lys Val
435 440 445

Thr Ser Gly Ser Thr Thr Thr Thr Arg Arg Ser Cys Ser Lys Thr Val
450 455 460

Thr Lys Thr Val Ile Gly Pro Asp Gly His Lys Glu Val Thr Lys Glu
465 470 475 480

Val Val Thr Ser Glu Asp Gly Ser Asp Cys Pro Glu Ala Met Asp Leu
485 490 495

Gly Thr Leu Ser Gly Ile Gly Thr Leu Asp Gly Phe Arg His Arg His
500 505 510

Pro Asp Glu Ala Ala Phe Phe Asp Thr Ala Ser Thr Gly Lys Thr Phe
515 520 525

Pro Gly Phe Phe Ser Pro Met Leu Gly Glu Phe Val Ser Glu Thr Glu
530 535 540

Ser Arg Gly Ser Glu Ser Gly Ile Phe Thr Asn Thr Lys Glu Ser Ser
545 550 555 560

Ser His His Pro Gly Ile Ala Glu Phe Pro Ser Arg Gly Lys Ser Ser
565 570 575

Ser Tyr Ser Lys Gln Phe Thr Ser Ser Thr Ser Tyr Asn Arg Gly Asp
580 585 590

Ser Thr Phe Glu Ser Lys Ser Tyr Lys Met Ala Asp Glu Ala Gly Ser
595 600 605

Glu Ala Asp His Glu Gly Thr His Ser Thr Lys Arg Gly His Ala Lys
610 615 620

Ser Arg Pro Val Arg Gly Ile His Thr Ser Pro Leu Gly Lys Pro Ser
625 630 635 640

Leu Ser Pro



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4027 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 3.. 4013 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 

AC ATG GCA GTG AGT CAT GGG AGG GAG AGC AAG CCT CTG ACT GCT CAA 47
Met Ala Val Ser His Gly Arg Glu Ser Lys Pro Leu Thr Ala Gln
1 5 10 15

CAG ACA ACC AAA CTG GAT GCT CCC ACT AAC CTC CAG TTT GTC AAT GAA 95
Gln Thr Thr Lys Leu Asp Ala Pro Thr Asn Leu Gln Phe Val Asn Glu
20 25 30

ACT GAT TCT ACT GTC CTG GTG AGA TGG ACT CCA CCT CGG GCC CAG ATA 143
Thr Asp Ser Thr Val Leu Val Arg Trp Thr Pro Pro Arg Ala Gln Ile
35 40 45

ACA GGA TAC CGA CTG ACC GTG GGC CTT ACC CGA AGA GGC CAG CCC AGG 191
Thr Gly Tyr Arg Leu Thr Val Gly Leu Thr Arg Arg Gly Gln Pro Arg
50 55 60

CAG TAC AAT GTG GGT CCC TCT GTC TCC AAG TAC CCC CTG AGG AAT CTG 239
Gln Tyr Asn Val Gly Pro Ser Val Ser Lys Tyr Pro Leu Arg Asn Leu
65 70 75

CAG CCT GCA TCT GAG TAC ACC GTA TCC CTC GTG GCC ATA AAG GGC AAC 287
Gln Pro Ala Ser Glu Tyr Thr Val Ser Leu Val Ala Ile Lys Gly Asn
80 85 90 95

CAA GAG AGC CCC AAA GCC ACT GGA GTC TTT ACC ACA CTG CAG CCT GGG 335
Gln Glu Ser Pro Lys Ala Thr Gly Val Phe Thr Thr Leu Gln Pro Gly
100 105 110

AGC TCT ATT CCA CCT TAC AAC ACC GAG GTG ACT GAG ACC ACC ATC GTG 383
Ser Ser Ile Pro Pro Tyr Asn Thr Glu Val Thr Glu Thr Thr Ile Val
115 120 125

ATC ACA TGG ACG CCT GCT CCA AGA ATT GGT TTT AAG CTG GGT GTA CGA 431
Ile Thr Trp Thr Pro Ala Pro Arg Ile Gly Phe Lys Leu Gly Val Arg
130 135 140

CCA AGC CAG GGA GGA GAG GCA CCA CGA GAA GTG ACT TCA GAC TCA GGA 479
Pro Ser Gln Gly Gly Glu Ala Pro Arg Glu Val Thr Ser Asp Ser Gly
145 150 155

AGC ATC GTT GTG TCC GGC TTG ACT CCA GGA GTA GAA TAC GTC TAC ACC 527
Ser Ile Val Val Ser Gly Leu Thr Pro Gly Val Glu Tyr Val Tyr Thr
160 165 170 175

ATC CAA GTC CTG AGA GAT GGA CAG GAA AGA GAT GCG CCA ATT GTA AAC 575
Ile Gln Val Leu Arg Asp Gly Gln Glu Arg Asp Ala Pro Ile Val Asn
180 185 190

AAA GTG GTG ACA CCA TTG TCT CCA CCA ACA AAC TTG CAT CTG GAG GCA 623
Lys Val Val Thr Pro Leu Ser Pro Pro Thr Asn Leu His Leu Glu Ala
195 200 205

AAC CCT GAC ACT GGA GTG CTC ACA GTC TCC TGG GAG AGG AGC ACC ACC 671
Asn Pro Asp Thr Gly Val Leu Thr Val Ser Trp Glu Arg Ser Thr Thr
210 215 220

CCA GAC ATT ACT GGT TAT AGA ATT ACC ACA ACC CCT ACA AAC GGC CAG 719
Pro Asp Ile Thr Gly Tyr Arg Ile Thr Thr Thr Pro Thr Asn Gly Gln
225 230 235

CAG GGA AAT TCT TTG GAA GAA GTG GTC CAT GCT GAT CAG AGC TCC TGC 767
Gln Gly Asn Ser Leu Glu Glu Val Val His Ala Asp Gln Ser Ser Cys
240 245 250 255

ACT TTT GAT AAC CTG AGT CCC GGC CTG GAG TAC AAT GTC AGT GTT TAC 815
Thr Phe Asp Asn Leu Ser Pro Gly Leu Glu Tyr Asn Val Ser Val Tyr
260 265 270

ACT GTC AAG GAT GAC AAG GAA AGT GTC CCT ATC TCT GAT ACC ATC ATC 863
Thr Val Lys Asp Asp Lys Glu Ser Val Pro Ile Ser Asp Thr Ile Ile
275 280 285

CCA GAG GTG CCC CAA CTC ACT GAC CTA AGC TTT GTT GAT ATA ACC GAT 911
Pro Glu Val Pro Gln Leu Thr Asp Leu Ser Phe Val Asp Ile Thr Asp
290 295 300

TCA AGC ATC GGC CTG AGG TGG ACC CCG CTA AAC TCT TCC ACC ATT ATT 959
Ser Ser Ile Gly Leu Arg Trp Thr Pro Leu Asn Ser Ser Thr Ile Ile
305 310 315

GGG TAC CGC ATC ACA GTA GTT GCG GCA GGA GAA GGT ATC CCT ATT TTT 1007
Gly Tyr Arg Ile Thr Val Val Ala Ala Gly Glu Gly Ile Pro Ile Phe
320 325 330 335

GAA GAT TTT GTG TAC TCC TCA GTA GGA TAC TAC ACA GTC ACA GGG CTG 1055
Glu Asp Phe Val Tyr Ser Ser Val Gly Tyr Tyr Thr Val Thr Gly Leu
340 345 350

GAG CCG GGC ATT GAC TAT GAT ATC AGC GTT ATC ACT CTC ATT AAT GGC 1103
Glu Pro Gly Ile Asp Tyr Asp Ile Ser Val Ile Thr Leu Ile Asn Gly
355 360 365

GGC GAG AGT GCC CCT ACT ACA CTG ACA CAA CAA ACG GCT GTT CCT CCT 1151
Gly Glu Ser Ala Pro Thr Thr Leu Thr Gln Gln Thr Ala Val Pro Pro
370 375 380

CCC ACT GAC CTG CGA TTC ACC AAC ATT GGT CCA GAC ACC ATG CGT GTC 1199
Pro Thr Asp Leu Arg Phe Thr Asn Ile Gly Pro Asp Thr Met Arg Val
385 390 395

ACC TGG GCT CCA CCC CCA TCC ATT GAT TTA ACC AAC TTC CTG GTG CGT 1247
Thr Trp Ala Pro Pro Pro Ser Ile Asp Leu Thr Asn Phe Leu Val Arg
400 405 410 415

TAC TCA CCT GTG AAA AAT GAG GAA GAT GTT GCA GAG TTG TCA ATT TCT 1295
Tyr Ser Pro Val Lys Asn Glu Glu Asp Val Ala Glu Leu Ser Ile Ser
420 425 430

CCT TCA GAC AAT GCA GTG GTC TTA ACA AAT CTC CTG CCT GGT ACA GAA 1343
Pro Ser Asp Asn Ala Val Val Leu Thr Asn Leu Leu Pro Gly Thr Glu
435 440 445

TAT GTA GTG AGT GTC TCC AGT GTC TAC GAA CAA CAT GAG AGC ACA CCT 1391
Tyr Val Val Ser Val Ser Ser Val Tyr Glu Gln His Glu Ser Thr Pro
450 455 460

CTT AGA GGA AGA CAG AAA ACA GGT CTT GAT TCC CCA ACT GGC ATT GAC 1439
Leu Arg Gly Arg Gln Lys Thr Gly Leu Asp Ser Pro Thr Gly Ile Asp
465 470 475

TTT TCT GAT ATT ACT GCC AAC TCT TTT ACT GTG CAC TGG ATT GCT CCT 1487
Phe Ser Asp Ile Thr Ala Asn Ser Phe Thr Val His Trp Ile Ala Pro
480 485 490 495

CGA GCC ACC ATC ACT GGC TAC AGG ATC CGC CAT CAT CCC GAG CAC TTC 1535
Arg Ala Thr Ile Thr Gly Tyr Arg Ile Arg His His Pro Glu His Phe
500 505 510

AGT GGG AGA CCT CGA GAA GAT CGG GTG CCC CAC TCT CGG AAT TCC ATC 1583
Ser Gly Arg Pro Arg Glu Asp Arg Val Pro His Ser Arg Asn Ser Ile
515 520 525

ACC CTC ACC AAC CTC ACT CCA GGC ACA GAG TAT GTG GTC AGC ATC GTT 1631
Thr Leu Thr Asn Leu Thr Pro Gly Thr Glu Tyr Val Val Ser Ile Val
530 535 540

GCT CTT AAT GGC AGA GAG GAA AGT CCC TTA TTG ATT GGC CAA CAA TCA 1679
Ala Leu Asn Gly Arg Glu Glu Ser Pro Leu Leu Ile Gly Gln Gln Ser
545 550 555

ACA GTT TCT GAT GTT CCG AGG GAC CTG GAA GTT GTT GCT GCG ACC CCC 1727
Thr Val Ser Asp Val Pro Arg Asp Leu Glu Val Val Ala Ala Thr Pro
560 565 570 575

ACC AGC CTA CTG ATC AGC TGG GAT GCT CCT GCT GTC ACA GTG AGA TAT 1775
Thr Ser Leu Leu Ile Ser Trp Asp Ala Pro Ala Val Thr Val Arg Tyr
580 585 590

TAC AGG ATC ACT TAC GGA GAA ACA GGA GGA AAT AGC CCT GTC CAG GAG 1823
Tyr Arg Ile Thr Tyr Gly Glu Thr Gly Gly Asn Ser Pro Val Gln Glu
595 600 605

TTC ACT GTG CCT GGG AGC AAG TCT ACA GCT ACC ATC AGC GGC CTT AAA 1871
Phe Thr Val Pro Gly Ser Lys Ser Thr Ala Thr Ile Ser Gly Leu Lys
610 615 620

CCT GGA GTT GAT TAT ACC ATC ACT GTG TAT GCT GTC ACT GGC CGT GGA 1919
Pro Gly Val Asp Tyr Thr Ile Thr Val Tyr Ala Val Thr Gly Arg Gly
625 630 635

GAC AGC CCC GCA AGC AGC AAG CCA ATT TCC ATT AAT TAC CGA ACA GAA 1967
Asp Ser Pro Ala Ser Ser Lys Pro Ile Ser Ile Asn Tyr Arg Thr Glu
640 645 650 655

ATT GAC AAA CCA TCC CAG ATG CAA GTG ACC GAT GTT CAG GAC AAC AGC 2015
Ile Asp Lys Pro Ser Gln Met Gln Val Thr Asp Val Gln Asp Asn Ser
660 665 670

ATT AGT GTC AAG TGG CTG CCT TCA AGT TCC CCT GTT ACT GGT TAC AGA 2063
Ile Ser Val Lys Trp Leu Pro Ser Ser Ser Pro Val Thr Gly Tyr Arg
675 680 685

GTA ACC ACC ACT CCC AAA AAT GGA CCA GGA CCA ACA AAA ACT AAA ACT 2111
Val Thr Thr Thr Pro Lys Asn Gly Pro Gly Pro Thr Lys Thr Lys Thr
690 695 700

GCA GGT CCA GAT CAA ACA GAA ATG ACT ATT GAA GGC TTG CAG CCC ACA 2159
Ala Gly Pro Asp Gln Thr Glu Met Thr Ile Glu Gly Leu Gln Pro Thr
705 710 715

GTG GAG TAT GTG GTT AGT GTC TAT GCT CAG AAT CCA AGC GGA GAG AGT 2207
Val Glu Tyr Val Val Ser Val Tyr Ala Gln Asn Pro Ser Gly Glu Ser
720 725 730 735

CAG CCT CTG GTT CAG ACT GCA GTA ACC AAC ATT GAT CGC CCT AAA GGA 2255
Gln Pro Leu Val Gln Thr Ala Val Thr Asn Ile Asp Arg Pro Lys Gly
740 745 750

CTG GCA TTC ACT GAT GTG GAT GTC GAT TCC ATC AAA ATT GCT TGG GAA 2303
Leu Ala Phe Thr Asp Val Asp Val Asp Ser Ile Lys Ile Ala Trp Glu
755 760 765

AGC CCA CAG GGG CAA GTT TCC AGG TAC AGG GTG ACC TAC TCG AGC CCT 2351
Ser Pro Gln Gly Gln Val Ser Arg Tyr Arg Val Thr Tyr Ser Ser Pro
770 775 780

GAG GAT GGA ATC CAT GAG CTA TTC CCT GCA CCT GAT GGT GAA GAA GAC 2399
Glu Asp Gly Ile His Glu Leu Phe Pro Ala Pro Asp Gly Glu Glu Asp
785 790 795

ACT GCA GAG CTG CAA GGC CTC AGA CCG GGT TCT GAG TAC ACA GTC AGT 2447
Thr Ala Glu Leu Gln Gly Leu Arg Pro Gly Ser Glu Tyr Thr Val Ser
800 805 810 815

GTG GTT GCC TTG CAC GAT GAT ATG GAG AGC CAG CCC CTG ATT GGA ACC 2495
Val Val Ala Leu His Asp Asp Met Glu Ser Gln Pro Leu Ile Gly Thr
820 825 830

CAG TCC ACA GCT ATT CCT GCA CCA ACT GAC CTG AAG TTC ACT CAG GTC 2543
Gln Ser Thr Ala Ile Pro Ala Pro Thr Asp Leu Lys Phe Thr Gln Val
835 840 845

ACA CCC ACA AGC CTG AGC GCC CAG TGG ACA CCA CCC AAT GTT CAG CTC 2591
Thr Pro Thr Ser Leu Ser Ala Gln Trp Thr Pro Pro Asn Val Gln Leu
850 855 860

ACT GGA TAT CGA GTG CGG GTG ACC CCC AAG GAG AAG ACC GGA CCA ATG 2639
Thr Gly Tyr Arg Val Arg Val Thr Pro Lys Glu Lys Thr Gly Pro Met
865 870 875

AAA GAA ATC AAC CTT GCT CCT GAC AGC TCA TCC GTG GTT GTA TCA GGA 2687
Lys Glu Ile Asn Leu Ala Pro Asp Ser Ser Ser Val Val Val Ser Gly
880 885 890 895

CTT ATG GTG GCC ACC AAA TAT GAA GTG AGT GTC TAT GCT CTT AAG GAC 2735
Leu Met Val Ala Thr Lys Tyr Glu Val Ser Val Tyr Ala Leu Lys Asp
900 905 910

ACT TTG ACA AGC AGA CCA GCT CAG GGT GTT GTC ACC ACT CTG GAG GGA 2783
Thr Leu Thr Ser Arg Pro Ala Gln Gly Val Val Thr Thr Leu Glu Gly
915 920 925

GGA AAT TTT AAG AGC CAG CTT CAG AAG GTA CCC CCA GAG TGG AAG GCA 2831
Gly Asn Phe Lys Ser Gln Leu Gln Lys Val Pro Pro Glu Trp Lys Ala
930 935 940

TTA ACA GAC ATG CCG CAG ATG AGA ATG GAG TTA GAG AGA CCT GGT GGA 2879
Leu Thr Asp Met Pro Gln Met Arg Met Glu Leu Glu Arg Pro Gly Gly
945 950 955

AAT GAG ATT ACT CGA GGA GGC TCC ACC TCT TAT GGA ACC GGA TCA GAG 2927
Asn Glu Ile Thr Arg Gly Gly Ser Thr Ser Tyr Gly Thr Gly Ser Glu
960 965 970 975

ACG GAA AGC CCC AGG AAC CCT AGC AGT GCT GGA AGC TGG AAC TCT GGG 2975
Thr Glu Ser Pro Arg Asn Pro Ser Ser Ala Gly Ser Trp Asn Ser Gly
980 985 990

AGC TCT GGA CCT GGA AGT ACT GGA AAC CGA AAC CCT GGG AGC TCT GGG 3023
Ser Ser Gly Pro Gly Ser Thr Gly Asn Arg Asn Pro Gly Ser Ser Gly
995 1000 1005

ACT GGA GGG ACT GCA ACC TGG AAA CCT GGG AGC TCT GGA CCT GGA AGT 3071
Thr Gly Gly Thr Ala Thr Trp Lys Pro Gly Ser Ser Gly Pro Gly Ser
1010 1015 1020

GCT GGA AGC TGG AAC TCT GGG AGC TCT GGA ACT GGA AGT ACT GGA AAC 3119
Ala Gly Ser Trp Asn Ser Gly Ser Ser Gly Thr Gly Ser Thr Gly Asn
1025 1030 1035

CAA AAC CCT GGG AGC CCT AGA CCT GGT AGT ACC GGA ACC TGG AAT CCT 3167
Gln Asn Pro Gly Ser Pro Arg Pro Gly Ser Thr Gly Thr Trp Asn Pro
1040 1045 1050 1055

GGC AGC TCT GAA CGC GGA AGT GCT GGG CAC TGG ACC TCT GAG AGC TCT 3215
Gly Ser Ser Glu Arg Gly Ser Ala Gly His Trp Thr Ser Glu Ser Ser
1060 1065 1070

GTA TCT GGT AGT ACT GGA CAA TGG CAC TCT GAA TCT GGA AGT TTT AGG 3263
Val Ser Gly Ser Thr Gly Gln Trp His Ser Glu Ser Gly Ser Phe Arg
1075 1080 1085

CCA GAT AGC CCA GGC TCT GGG AAC GCG AGG CCT AAC AAC CCA GAC TGG 3311
Pro Asp Ser Pro Gly Ser Gly Asn Ala Arg Pro Asn Asn Pro Asp Trp
1090 1095 1100

GGC ACA TTT GAA GAG GTG TCA GGA AAT GTA AGT CCA GGG ACA AGG AGA 3359
Gly Thr Phe Glu Glu Val Ser Gly Asn Val Ser Pro Gly Thr Arg Arg
1105 1110 1115

GAG TAC CAC ACA GAA AAA CTG GTC ACT AAA GGA GAT AAA GAG CTC AGG 3407
Glu Tyr His Thr Glu Lys Leu Val Thr Lys Gly Asp Lys Glu Leu Arg
1120 1125 1130 1135

ACT GGT AAA GAG AAG GTC ACC TCT GGT AGC ACA ACC ACC ACG CGT CGT 3455
Thr Gly Lys Glu Lys Val Thr Ser Gly Ser Thr Thr Thr Thr Arg Arg
1140 1145 1150

TCA TGC TCT AAA ACC GTT ACT AAG ACT GTT ATT GGT CCT GAT GGT CAC 3503
Ser Cys Ser Lys Thr Val Thr Lys Thr Val Ile Gly Pro Asp Gly His
1155 1160 1165

AAA GAA GTT ACC AAA GAA GTG GTG ACC TCC GAA GAT GGT TCT GAC TGT 3551
Lys Glu Val Thr Lys Glu Val Val Thr Ser Glu Asp Gly Ser Asp Cys
1170 1175 1180

CCC GAG GCA ATG GAT TTA GGC ACA TTG TCT GGC ATA GGT ACT CTG GAT 3599
Pro Glu Ala Met Asp Leu Gly Thr Leu Ser Gly Ile Gly Thr Leu Asp
1185 1190 1195

GGG TTC CGC CAT AGG CAC CCT GAT GAA GCT GCC TTC TTC GAC ACT GCC 3647
Gly Phe Arg His Arg His Pro Asp Glu Ala Ala Phe Phe Asp Thr Ala
1200 1205 1210 1215

TCA ACT GGA AAA ACA TTC CCA GGT TTC TTC TCA CCT ATG TTA GGA GAG 3695
Ser Thr Gly Lys Thr Phe Pro Gly Phe Phe Ser Pro Met Leu Gly Glu
1220 1225 1230

TTT GTC AGT GAG ACT GAG TCT AGG GGC TCA GAA TCT GGC ATC TTC ACA 3743
Phe Val Ser Glu Thr Glu Ser Arg Gly Ser Glu Ser Gly Ile Phe Thr
1235 1240 1245

AAT ACA AAG GAA TCC AGT TCT CAT CAC CCT GGG ATA GCT GAA TTC CCT 3791
Asn Thr Lys Glu Ser Ser Ser His His Pro Gly Ile Ala Glu Phe Pro
1250 1255 1260

TCC CGT GGT AAA TCT TCA AGT TAC AGC AAA CAA TTT ACT AGT AGC ACG 3839
Ser Arg Gly Lys Ser Ser Ser Tyr Ser Lys Gln Phe Thr Ser Ser Thr
1265 1270 1275

AGT TAC AAC AGA GGA GAC TCC ACA TTT GAA AGC AAG AGC TAT AAA ATG 3887
Ser Tyr Asn Arg Gly Asp Ser Thr Phe Glu Ser Lys Ser Tyr Lys Met
1280 1285 1290 1295

GCA GAT GAG GCC GGA AGT GAA GCC GAT CAT GAA GGA ACA CAT AGC ACC 3935
Ala Asp Glu Ala Gly Ser Glu Ala Asp His Glu Gly Thr His Ser Thr
1300 1305 1310

AAG AGA GGC CAT GCT AAA TCT CGC CCT GTC AGA GGT ATC CAC ACT TCT 3983
Lys Arg Gly His Ala Lys Ser Arg Pro Val Arg Gly Ile His Thr Ser
1315 1320 1325

CCT TTG GGG AAG CCT TCC CTG TCC CCC TAGACTAAGT TAAATAT 4027
Pro Leu Gly Lys Pro Ser Leu Ser Pro
1330 1335

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1336 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Met Ala Val Ser His Gly Arg Glu Ser Lys Pro Leu Thr Ala Gln Gln
1 5 10 15

Thr Thr Lys Leu Asp Ala Pro Thr Asn Leu Gln Phe Val Asn Glu Thr
20 25 30

Asp Ser Thr Val Leu Val Arg Trp Thr Pro Pro Arg Ala Gln Ile Thr
35 40 45

Gly Tyr Arg Leu Thr Val Gly Leu Thr Arg Arg Gly Gln Pro Arg Gln
50 55 60

Tyr Asn Val Gly Pro Ser Val Ser Lys Tyr Pro Leu Arg Asn Leu Gln
65 70 75 80

Pro Ala Ser Glu Tyr Thr Val Ser Leu Val Ala Ile Lys Gly Asn Gln
85 90 95

Glu Ser Pro Lys Ala Thr Gly Val Phe Thr Thr Leu Gln Pro Gly Ser
100 105 110

Ser Ile Pro Pro Tyr Asn Thr Glu Val Thr Glu Thr Thr Ile Val Ile
115 120 125

Thr Trp Thr Pro Ala Pro Arg Ile Gly Phe Lys Leu Gly Val Arg Pro
130 135 140

Ser Gln Gly Gly Glu Ala Pro Arg Glu Val Thr Ser Asp Ser Gly Ser
145 150 155 160

Ile Val Val Ser Gly Leu Thr Pro Gly Val Glu Tyr Val Tyr Thr Ile
165 170 175

Gln Val Leu Arg Asp Gly Gln Glu Arg Asp Ala Pro Ile Val Asn Lys
180 185 190

Val Val Thr Pro Leu Ser Pro Pro Thr Asn Leu His Leu Glu Ala Asn
195 200 205

Pro Asp Thr Gly Val Leu Thr Val Ser Trp Glu Arg Ser Thr Thr Pro
210 215 220

Asp Ile Thr Gly Tyr Arg Ile Thr Thr Thr Pro Thr Asn Gly Gln Gln
225 230 235 240

Gly Asn Ser Leu Glu Glu Val Val His Ala Asp Gln Ser Ser Cys Thr
245 250 255

Phe Asp Asn Leu Ser Pro Gly Leu Glu Tyr Asn Val Ser Val Tyr Thr
260 265 270

Val Lys Asp Asp Lys Glu Ser Val Pro Ile Ser Asp Thr Ile Ile Pro
275 280 285

Glu Val Pro Gln Leu Thr Asp Leu Ser Phe Val Asp Ile Thr Asp Ser
290 295 300

Ser Ile Gly Leu Arg Trp Thr Pro Leu Asn Ser Ser Thr Ile Ile Gly
305 310 315 320

Tyr Arg Ile Thr Val Val Ala Ala Gly Glu Gly Ile Pro Ile Phe Glu
325 330 335

Asp Phe Val Tyr Ser Ser Val Gly Tyr Tyr Thr Val Thr Gly Leu Glu
340 345 350

Pro Gly Ile Asp Tyr Asp Ile Ser Val Ile Thr Leu Ile Asn Gly Gly
355 360 365

Glu Ser Ala Pro Thr Thr Leu Thr Gln Gln Thr Ala Val Pro Pro Pro
370 375 380

Thr Asp Leu Arg Phe Thr Asn Ile Gly Pro Asp Thr Met Arg Val Thr
385 390 395 400

Trp Ala Pro Pro Pro Ser Ile Asp Leu Thr Asn Phe Leu Val Arg Tyr
405 410 415

Ser Pro Val Lys Asn Glu Glu Asp Val Ala Glu Leu Ser Ile Ser Pro
420 425 430

Ser Asp Asn Ala Val Val Leu Thr Asn Leu Leu Pro Gly Thr Glu Tyr
435 440 445

Val Val Ser Val Ser Ser Val Tyr Glu Gln His Glu Ser Thr Pro Leu
450 455 460

Arg Gly Arg Gln Lys Thr Gly Leu Asp Ser Pro Thr Gly Ile Asp Phe
465 470 475 480

Ser Asp Ile Thr Ala Asn Ser Phe Thr Val His Trp Ile Ala Pro Arg
485 490 495

Ala Thr Ile Thr Gly Tyr Arg Ile Arg His His Pro Glu His Phe Ser
500 505 510

Gly Arg Pro Arg Glu Asp Arg Val Pro His Ser Arg Asn Ser Ile Thr
515 520 525

Leu Thr Asn Leu Thr Pro Gly Thr Glu Tyr Val Val Ser Ile Val Ala
530 535 540

Leu Asn Gly Arg Glu Glu Ser Pro Leu Leu Ile Gly Gln Gln Ser Thr
545 550 555 560

Val Ser Asp Val Pro Arg Asp Leu Glu Val Val Ala Ala Thr Pro Thr
565 570 575

Ser Leu Leu Ile Ser Trp Asp Ala Pro Ala Val Thr Val Arg Tyr Tyr
580 585 590

Arg Ile Thr Tyr Gly Glu Thr Gly Gly Asn Ser Pro Val Gln Glu Phe
595 600 605

Thr Val Pro Gly Ser Lys Ser Thr Ala Thr Ile Ser Gly Leu Lys Pro
610 615 620

Gly Val Asp Tyr Thr Ile Thr Val Tyr Ala Val Thr Gly Arg Gly Asp
625 630 635 640

Ser Pro Ala Ser Ser Lys Pro Ile Ser Ile Asn Tyr Arg Thr Glu Ile
645 650 655

Asp Lys Pro Ser Gln Met Gln Val Thr Asp Val Gln Asp Asn Ser Ile
660 665 670

Ser Val Lys Trp Leu Pro Ser Ser Ser Pro Val Thr Gly Tyr Arg Val
675 680 685

Thr Thr Thr Pro Lys Asn Gly Pro Gly Pro Thr Lys Thr Lys Thr Ala
690 695 700

Gly Pro Asp Gln Thr Glu Met Thr Ile Glu Gly Leu Gln Pro Thr Val
705 710 715 720

Glu Tyr Val Val Ser Val Tyr Ala Gln Asn Pro Ser Gly Glu Ser Gln
725 730 735

Pro Leu Val Gln Thr Ala Val Thr Asn Ile Asp Arg Pro Lys Gly Leu
740 745 750

Ala Phe Thr Asp Val Asp Val Asp Ser Ile Lys Ile Ala Trp Glu Ser
755 760 765

Pro Gln Gly Gln Val Ser Arg Tyr Arg Val Thr Tyr Ser Ser Pro Glu
770 775 780

Asp Gly Ile His Glu Leu Phe Pro Ala Pro Asp Gly Glu Glu Asp Thr
785 790 795 800

Ala Glu Leu Gin Gly Leu Arg Pro Gly Ser Glu Tyr Thr Val Ser Val 
805 810 815 
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Val Ala Leu His Asp Asp Met Glu Ser Gin Pro Leu He Gly Thr Gin 
820 825 830 

Ser Thr Ala Ile Pro Ala Pro Thr Asp Leu Lys Phe Thr Gln Val Thr
835 840 845

Pro Thr Ser Leu Ser Ala Gln Trp Thr Pro Pro Asn Val Gln Leu Thr
850 855 860 

Gly Tyr Arg Val Arg Val Thr Pro Lys Glu Lys Thr Gly Pro Met Lys 
865 870 875 880 

Glu Ile Asn Leu Ala Pro Asp Ser Ser Ser Val Val Val Ser Gly Leu
885 890 895 

Met Val Ala Thr Lys Tyr Glu Val Ser Val Tyr Ala Leu Lys Asp Thr 
900 905 910 

Leu Thr Ser Arg Pro Ala Gln Gly Val Val Thr Thr Leu Glu Gly Gly
915 920 925

Asn Phe Lys Ser Gln Leu Gln Lys Val Pro Pro Glu Trp Lys Ala Leu
930 935 940

Thr Asp Met Pro Gln Met Arg Met Glu Leu Glu Arg Pro Gly Gly Asn
945 950 955 960

Glu Ile Thr Arg Gly Gly Ser Thr Ser Tyr Gly Thr Gly Ser Glu Thr
965 970 975 

Glu Ser Pro Arg Asn Pro Ser Ser Ala Gly Ser Trp Asn Ser Gly Ser 
980 985 990 

Ser Gly Pro Gly Ser Thr Gly Asn Arg Asn Pro Gly Ser Ser Gly Thr 
995 1000 1005 

Gly Gly Thr Ala Thr Trp Lys Pro Gly Ser Ser Gly Pro Gly Ser Ala 
1010 1015 1020 

Gly Ser Trp Asn Ser Gly Ser Ser Gly Thr Gly Ser Thr Gly Asn Gln
1025 1030 1035 1040

Asn Pro Gly Ser Pro Arg Pro Gly Ser Thr Gly Thr Trp Asn Pro Gly 
1045 1050 1055 

Ser Ser Glu Arg Gly Ser Ala Gly His Trp Thr Ser Glu Ser Ser Val 
1060 1065 1070 

Ser Gly Ser Thr Gly Gln Trp His Ser Glu Ser Gly Ser Phe Arg Pro
1075 1080 1085 

Asp Ser Pro Gly Ser Gly Asn Ala Arg Pro Asn Asn Pro Asp Trp Gly 
1090 1095 1100

Thr Phe Glu Glu Val Ser Gly Asn Val Ser Pro Gly Thr Arg Arg Glu
1105 1110 1115 1120

Tyr His Thr Glu Lys Leu Val Thr Lys Gly Asp Lys Glu Leu Arg Thr 
1125 1130 1135 

Gly Lys Glu Lys Val Thr Ser Gly Ser Thr Thr Thr Thr Arg Arg Ser 
1140 1145 1150 

Cys Ser Lys Thr Val Thr Lys Thr Val Ile Gly Pro Asp Gly His Lys
1155 1160 1165

Glu Val Thr Lys Glu Val Val Thr Ser Glu Asp Gly Ser Asp Cys Pro
1170 1175 1180

Glu Ala Met Asp Leu Gly Thr Leu Ser Gly Ile Gly Thr Leu Asp Gly
1185 1190 1195 1200 

Phe Arg His Arg His Pro Asp Glu Ala Ala Phe Phe Asp Thr Ala Ser 
1205 1210 1215 

Thr Gly Lys Thr Phe Pro Gly Phe Phe Ser Pro Met Leu Gly Glu Phe 
1220 1225 1230 

Val Ser Glu Thr Glu Ser Arg Gly Ser Glu Ser Gly Ile Phe Thr Asn
1235 1240 1245

Thr Lys Glu Ser Ser Ser His His Pro Gly Ile Ala Glu Phe Pro Ser
1250 1255 1260

Arg Gly Lys Ser Ser Ser Tyr Ser Lys Gln Phe Thr Ser Ser Thr Ser
1265 1270 1275 1280 

Tyr Asn Arg Gly Asp Ser Thr Phe Glu Ser Lys Ser Tyr Lys Met Ala 
1285 1290 1295 

Asp Glu Ala Gly Ser Glu Ala Asp His Glu Gly Thr His Ser Thr Lys 
1300 1305 1310 

Arg Gly His Ala Lys Ser Arg Pro Val Arg Gly Ile His Thr Ser Pro
1315 1320 1325 



Leu Gly Lys Pro Ser Leu Ser Pro 
1330 1335

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 
(B) CLONE: ZC1551 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GATCCCCGGG GAGCTCCTCG AGGCATG 
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 
(B) CLONE: ZC1552 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
CCTCGAGGAG CTCCCCGGG 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 
(B) CLONE: ZC2052 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
AATTCACCAT GGCAGTGAGT
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 
(B) CLONE: ZC2053 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
CATGACTCAC TGCCATGGTG 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 
(B) CLONE: ZC2491 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
CTAGATTAGA ATGGGGCC 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 
(B) CLONE: ZC2493 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CCATTCTAAT
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 88 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: ZC3521

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
TCGACTTAAG GACACTTTGA CAAGCAGACC AGCTCAGGGT GTTGTCACCA CTCTGGAGGG 60 
AGGAAATTTT AAGAGCCAGC TTCAGAAG 88 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 88 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: ZC3522 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GTACCTTCTG AAGCTGGCTC TTAAAATTTC CTCCCTCCAG AGTGGTGACA ACACCCTGAG 60 
CTGGTCTGCT TGTCAAAGTG TCCTTAAG 88 
I Claim:

1. A hybrid protein comprising a tissue-binding 
domain from a first protein covalently linked to a cross- 
linking domain from a second protein. 

2. A hybrid protein according to claim 1 wherein 
the tissue-binding domain of the first protein is a heparin 
binding domain of thrombospondin, a heparin binding domain of 
fibronectin, a collagen binding domain of fibronectin or a 
cell binding domain of fibronectin. 

3. A hybrid protein according to claim 1 wherein 
the tissue-binding domain of the first protein comprises the 
amino acid sequence of Sequence ID No. 6 from Alanine, amino 
acid 2 to Glutamic acid, amino acid number 926.

4. A hybrid protein according to claim 1 wherein 
the cross-linking domain of the second protein comprises the 
carboxy-terminal 103 amino acids of loricrin; the ten amino 
acid repeat beginning with glutamine, amino acid number 496 of 
involucrin; or the 400 amino-terminal amino acids of the 
fibrinogen α chain.

5. A hybrid protein according to claim 1 wherein 
the cross-linking domain of the second protein comprises the 
amino acid sequence of Sequence ID No. 6 from Glycine, amino 
acid number 928 to Proline, amino acid number 1336.

6. A hybrid protein according to claim 1 
comprising the amino acid sequence of Sequence ID Number 6 
from alanine, amino acid number 2 to Proline, amino acid 
number 1336.

7. An isolated DNA molecule encoding a hybrid 
protein comprising a first DNA segment encoding a tissue- 
binding domain from a first protein joined to a second DNA 
segment encoding a cross-linking domain from a second protein.

8. A DNA molecule according to claim 7 wherein the
first DNA segment encodes a heparin binding domain of 
thrombospondin, a heparin binding domain of fibronectin, a
collagen binding domain of fibronectin or a cell binding
domain of fibronectin.

9. A DNA molecule according to claim 7 wherein the 
first DNA segment comprises the nucleotide sequence of 
Sequence ID No. 5 from nucleotide 3 to nucleotide 2780. 

10. A DNA molecule according to claim 7 wherein the 
first DNA segment encodes the amino acid sequence of Sequence 
ID No. 6 from methionine, amino acid number 1 to glutamic 
acid, amino acid number 926. 

11. A DNA molecule according to claim 7 wherein the 
second DNA segment encodes the carboxy-terminal 103 amino 
acids of loricrin; the ten amino acid repeat beginning with 
glutamine, amino acid number 496 of involucrin; or the 400 
amino-terminal amino acids of the fibrinogen α chain.

12. A DNA molecule according to claim 7 wherein the 
second DNA segment comprises the nucleotide sequence of 
Sequence ID No. 5 from nucleotide 2784 to nucleotide 4013. 

13. A DNA molecule according to claim 7 wherein the 
second DNA segment encodes the amino acid sequence of Sequence 
ID No. 6 from glycine, amino acid number 928 to proline, amino 
acid number 1336.

14. A DNA molecule according to claim 7 wherein the 
DNA molecule encodes the amino acid sequence of Sequence ID 
Number 6 from Methionine, amino acid number 1 to Proline, 
amino acid number 1336.

15. A DNA molecule according to claim 7 wherein the
DNA molecule comprises the nucleotide sequence of Sequence ID 
Number 5 from nucleotide 3 to nucleotide 4013. 

16. A DNA construct comprising a DNA molecule 
encoding a hybrid protein, wherein said DNA molecule comprises 
a first DNA segment encoding a tissue-binding domain from a 
first protein joined to a second DNA segment encoding a cross- 
linking domain from a second protein, and wherein said DNA 
molecule is operably linked to other DNA segments required for 
the expression of the DNA molecule. 

17. A DNA construct according to claim 16 wherein 
the first DNA segment encodes a heparin binding domain of 
thrombospondin, a heparin binding domain of fibronectin, a 
collagen binding domain of fibronectin or a cell binding 
domain of fibronectin. 

18. A DNA construct according to claim 16 wherein 
the first DNA segment comprises the nucleotide sequence of 
Sequence ID No. 5 from nucleotide 3 to nucleotide 2780. 

19. A DNA construct according to claim 16 wherein 
the first DNA segment encodes the amino acid sequence of 
Sequence ID No. 6 from methionine, amino acid 1 to Glutamic 
acid, amino acid number 926. 

20. A DNA construct according to claim 16 wherein 
the second DNA segment encodes the carboxy-terminal 103 amino 
acids of loricrin; the ten amino acid repeat beginning with 
glutamine, amino acid number 496 of involucrin; or the 400 
amino-terminal amino acids of the fibrinogen α chain.

21. A DNA construct according to claim 16 wherein 
the second DNA segment comprises the nucleotide sequence of 
Sequence ID No. 5 from nucleotide 2784 to nucleotide 4013.

22. A DNA construct according to claim 16 wherein
the second DNA segment encodes the amino acid sequence of 
Sequence ID No. 6 from glycine, amino acid number 928 to 
proline, amino acid number 1336. 

23. A DNA construct according to claim 16 wherein 
the DNA molecule comprises the nucleotide sequence of Sequence 
ID Number 5 from nucleotide 1 to nucleotide 4013.

24. A DNA construct according to claim 16 wherein 
the DNA molecule encodes the amino acid sequence of Sequence 
ID Number 6 from Methionine, amino acid number 1 to Proline,
amino acid number 1336. 

25. A host cell containing a DNA construct 
according to claim 16. 



26. A method for producing a hybrid protein 
comprising culturing a host cell according to claim 25 under
conditions promoting the expression of the first DNA segment. 